Datasets for new state-of-the-art challenge in disentanglement learning

Last update: May 26, 2022

Overview

High resolution disentanglement datasets

This repository contains the Falcor3D and Isaac3D datasets, which present a state-of-the-art challenge for controllable generation in terms of image resolution, photorealism, and richness of style factors, as compared to existing disentanglement datasets.

Falcor3D

The Falcor3D dataset consists of 233,280 images based on the 3D scene of a living room, where each image has a resolution of 1024x1024. The meta code corresponds to all possible combinations of 7 factors of variation:

lighting_intensity (5)
lighting_x-dir (6)
lighting_y-dir (6)
lighting_z-dir (6)
camera_x-pos (6)
camera_y-pos (6)
camera_z-pos (6)

The number m in parentheses after each factor is the number of possible values that factor takes, uniformly sampled in the normalized range of variation [0, 1].

Each image is named padded_index.png, where

index = lighting_intensity * 46656 + lighting_x-dir * 7776 + lighting_y-dir * 1296 + 
lighting_z-dir * 216 + camera_x-pos * 36 + camera_y-pos * 6 + camera_z-pos

padded_index = index padded with zeros such that it has 6 digits.
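
The filename rule above is a mixed-radix encoding of the seven factors. As a minimal sketch, assuming each factor enters the formula as an integer in 0..m-1 (the helper names below are hypothetical and not part of the repository), the mapping in both directions could look like this:

```python
# Factor names and value counts, in the order used by the index formula.
FALCOR3D_FACTORS = [
    ("lighting_intensity", 5),
    ("lighting_x-dir", 6),
    ("lighting_y-dir", 6),
    ("lighting_z-dir", 6),
    ("camera_x-pos", 6),
    ("camera_y-pos", 6),
    ("camera_z-pos", 6),
]


def factors_to_index(values, factors=FALCOR3D_FACTORS):
    """Combine per-factor integer values (assumed 0..m-1) into an image index."""
    if len(values) != len(factors):
        raise ValueError("expected one value per factor")
    index = 0
    for value, (name, size) in zip(values, factors):
        if not 0 <= value < size:
            raise ValueError(f"{name} must be in [0, {size})")
        # Horner form of the weighted sum: equivalent to multiplying each
        # value by the product of the sizes of all factors after it.
        index = index * size + value
    return index


def index_to_factors(index, factors=FALCOR3D_FACTORS):
    """Recover the per-factor integer values from an image index."""
    values = []
    for _, size in reversed(factors):
        index, value = divmod(index, size)
        values.append(value)
    return values[::-1]


def index_to_filename(index):
    """Zero-pad the index to 6 digits and append the .png extension."""
    return f"{index:06d}.png"
```

For example, `index_to_filename(factors_to_index([1, 2, 3, 4, 5, 0, 1]))` gives `"067141.png"`, and the largest valid combination `[4, 5, 5, 5, 5, 5, 5]` maps to index 233279, the last of the 233,280 images.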

To see the Falcor3D images by varying each factor of variation individually, you can run

python dataset_demo.py --dataset Falcor3D

and the results are saved in the examples/falcor3d_samples folder.

You can also check out the Falcor3D images here: falcor3d_samples_demo, which includes all the ground-truth latent traversals.

Isaac3D

The Isaac3D dataset consists of 737,280 images, based on the 3D scene of a kitchen, where each image has a resolution of 512x512. The meta code corresponds to all possible combinations of 9 factors of variation:

object_shape (3)
robot_x-movement (8)
robot_y-movement (5)
camera_height (4)
object_scale (4)
lighting_intensity (4)
lighting_y-dir (6)
object_color (4)
wall_color (4)

Similarly, the number m in parentheses after each factor is the number of possible values that factor takes, uniformly sampled in the normalized range of variation [0, 1].

Each image is named padded_index.png, where

index = object_shape * 245760 + robot_x-movement * 30720 + robot_y-movement * 6144 + 
camera_height * 1536 + object_scale * 384 + lighting_intensity * 96 + 
lighting_y-dir * 16 + object_color * 4 + wall_color

padded_index = index padded with zeros such that it has 6 digits.
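
Going the other way, an Isaac3D filename can be split back into one integer per term of the index formula. The sketch below uses only the multipliers and the image count quoted above; the function names are hypothetical, and values are returned in the order the terms appear in the formula:

```python
from pathlib import Path

# Multipliers copied from the Isaac3D index formula, most significant first.
ISAAC3D_MULTIPLIERS = [245760, 30720, 6144, 1536, 384, 96, 16, 4, 1]
ISAAC3D_NUM_IMAGES = 737280


def implied_sizes(multipliers=ISAAC3D_MULTIPLIERS, num_images=ISAAC3D_NUM_IMAGES):
    """Number of values each term must take for indices 0..num_images-1 to be covered exactly once."""
    bounds = [num_images] + list(multipliers)
    return [bounds[i] // bounds[i + 1] for i in range(len(multipliers))]


def decode_index(index, multipliers=ISAAC3D_MULTIPLIERS):
    """Split an image index into one integer per multiplier, in formula order."""
    values = []
    for multiplier in multipliers:
        value, index = divmod(index, multiplier)
        values.append(value)
    return values


def decode_filename(filename):
    """Parse a name such as '000123.png' and decode its index."""
    return decode_index(int(Path(filename).stem))
```

Calling `implied_sizes()` is a quick consistency check: each entry should equal the value count of the factor attached to the corresponding term of the formula.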

To see the Isaac3D images by varying each factor of variation individually, you can run

python dataset_demo.py --dataset Isaac3D

and the results are saved in the examples/isaac3d_samples folder.

You can also check out the Isaac3D images here: isaac3d_samples_demo, which includes all the ground-truth latent traversals.

Links to datasets

The two datasets can be downloaded from Google Drive:

Falcor3D (98 GB): link
Isaac3D (190 GB): link

We also provide downsampled versions (128x128 resolution) of both datasets:

Falcor3D_128x128 (3.7 GB): link
Isaac3D_128x128 (13 GB): link

License

This work is licensed under a Creative Commons Attribution 4.0 International License by NVIDIA Corporation (https://creativecommons.org/licenses/by/4.0/).

Owner

NVIDIA Research Projects
