PyTorch code for DriveGAN: Towards a Controllable High-Quality Neural Simulation

Overview

DriveGAN: Towards a Controllable High-Quality Neural Simulation

PyTorch code for DriveGAN

DriveGAN: Towards a Controllable High-Quality Neural Simulation
Seung Wook Kim, Jonah Philion, Antonio Torralba, Sanja Fidler
CVPR (oral), 2021
[Paper] [Project Page]

Abstract: Realistic simulators are critical for training and verifying robotics systems. While most of the contemporary simulators are hand-crafted, a scaleable way to build simulators is to use machine learning to learn how the environment behaves in response to an action, directly from data. In this work, we aim to learn to simulate a dynamic environment directly in pixel-space, by watching unannotated sequences of frames and their associated action pairs. We introduce a novel high-quality neural simulator referred to as DriveGAN that achieves controllability by disentangling different components without supervision. In addition to steering controls, it also includes controls for sampling features of a scene, such as the weather as well as the location of non-player objects. Since DriveGAN is a fully differentiable simulator, it further allows for re-simulation of a given video sequence, offering an agent to drive through a recorded scene again, possibly taking different actions. We train DriveGAN on multiple datasets, including 160 hours of real-world driving data. We showcase that our approach greatly surpasses the performance of previous data-driven simulators, and allows for new features not explored before.

For business inquires, please contact [email protected]

For press and other inquireis, please contact Hector Marinez at [email protected]

Citation

  • If you found this codebase useful in your research, please cite:
@inproceedings{kim2021drivegan,
  title={DriveGAN: Towards a Controllable High-Quality Neural Simulation},
  author={Kim, Seung Wook and Philion, Jonah and Torralba, Antonio and Fidler, Sanja},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5820--5829},
  year={2021}
}

Environment Setup

This codebase is tested with Ubuntu 18.04 and python 3.6.9, but it most likely would work with other close python3 versions.

  • Clone the repository
git clone https://github.com/nv-tlabs/DriveGAN_code.git
cd DriveGAN_code
  • Install dependencies
pip install -r requirements.txt

Data

We provide a dataset derived from Carla Simulator (https://carla.org/, https://github.com/carla-simulator/carla). This dataset is distributed under Creative Commons Attribution-NonCommercial 4.0 International Public LicenseCC BY-NC 4.0

All data are stored in the following link: https://drive.google.com/drive/folders/1fGM6KVzBL9M-6r7058fqyVnNcHVnYoJ3?usp=sharing

Training

Stage 1 (VAE-GAN)

If you want to skip stage 1 training, go to the Stage 2 (Dynamics Engine) section. For stage 1 training, download {0-5}.tar.gz from the link and extract. The extracted datasets have names starting with 6405 - change their name to data1 (for 0.tar.gz) to data6 (for 5.tar.gz).

cd DriveGAN_code/latent_decoder_model
mkdir img_data && cd img_data
tar -xvzf {0-5}.tar.gz
mv 6405x data{1-6}

Then, run

./scripts/train.sh ./img_data/data1,./img_data/data2,./img_data/data3,./img_data/data4,./img_data/data5,./img_data/data6

You can monitor training progress with tensorboard in the log_dir specified in train.sh

When validation loss converges, you can now encode the dataset with the learned model (located in log_dir from training)

./scripts/encode.sh ${path to saved model} 1 0 ./img_data/data1,./img_data/data2,./img_data/data3,./img_data/data4,./img_data/data5,./img_data/data6 ../encoded_data/data

Stage 2 (Dynamics Engine)

If you did not do Stage 1 training, download encoded_data.tar.gz and vaegan_iter210000.pt from link, and extract.

cd DriveGAN_code
mkdir encoded_data
tar -xvzf encoded_data.tar.gz -C encoded_data

Otherwise, run

cd DriveGAN_code
./scripts/train.sh encoded_data/data ${path to saved vae-gan model}

Playing with trained model

If you want to skip training, download simulator_epoch1020.pt and vaegan_iter210000.pt from link.

To play with a trained model, run

./scripts/play/server.sh ${path to saved dynamics engine} ${port e.g. 8888} ${path to saved vae-gan model}

Now you can navigate to localhost:{port} on your browser (tested on Chrome) and play.

(Controls - 'w': speed up, 's': slow down, 'a': steer left, 'd': steer right)

There are also additional buttons for changing contents. To sample a new scene, simply refresh the webpage.

License

Thie codebase and trained models are distributed under Nvidia Source Code License and the dataset is distributed under CC BY-NC 4.0.

Code for VAE-GAN is adapted from https://github.com/rosinality/stylegan2-pytorch (License).

Code for Lpips is imported from https://github.com/richzhang/PerceptualSimilarity (License).

StyleGAN custom ops are imported from https://github.com/NVlabs/stylegan2 (License).

Interactive UI code uses http://www.semantic-ui.com/ (License).

Efficient Training of Visual Transformers with Small Datasets

Official codes for "Efficient Training of Visual Transformers with Small Datasets", NerIPS 2021.

Yahui Liu 112 Dec 25, 2022
Adversarial-autoencoders - Tensorflow implementation of Adversarial Autoencoders

Adversarial Autoencoders (AAE) Tensorflow implementation of Adversarial Autoencoders (ICLR 2016) Similar to variational autoencoder (VAE), AAE imposes

Qian Ge 236 Nov 13, 2022
Implementation for the paper SMPLicit: Topology-aware Generative Model for Clothed People (CVPR 2021)

SMPLicit: Topology-aware Generative Model for Clothed People [Project] [arXiv] License Software Copyright License for non-commercial scientific resear

Enric Corona 225 Dec 13, 2022
Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition"

Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition" Pre-trained Deep Convo

Ankush Malaker 5 Nov 11, 2022
DeceFL: A Principled Decentralized Federated Learning Framework

DeceFL: A Principled Decentralized Federated Learning Framework This repository comprises codes that reproduce experiments in Ye, et al (2021), which

Huazhong Artificial Intelligence Lab (HAIL) 10 May 31, 2022
Time series annotation library.

CrowdCurio Time Series Annotator Library The CrowdCurio Time Series Annotation Library implements classification tasks for time series. Features Suppo

CrowdCurio 51 Sep 15, 2022
A PyTorch implementation of ViTGAN based on paper ViTGAN: Training GANs with Vision Transformers.

ViTGAN: Training GANs with Vision Transformers A PyTorch implementation of ViTGAN based on paper ViTGAN: Training GANs with Vision Transformers. Refer

Hong-Jia Chen 127 Dec 23, 2022
Code for "Human Pose Regression with Residual Log-likelihood Estimation", ICCV 2021 Oral

Human Pose Regression with Residual Log-likelihood Estimation [Paper] [arXiv] [Project Page] Human Pose Regression with Residual Log-likelihood Estima

JeffLi 347 Dec 24, 2022
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more

PyTorch Image Models Sponsors What's New Introduction Models Features Results Getting Started (Documentation) Train, Validation, Inference Scripts Awe

Ross Wightman 22.9k Jan 09, 2023
An implementation of the proximal policy optimization algorithm

PPO Pytorch C++ This is an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch. It uses a simple TestEnvironment t

Martin Huber 59 Dec 09, 2022
A sequence of Jupyter notebooks featuring the 12 Steps to Navier-Stokes

CFD Python Please cite as: Barba, Lorena A., and Forsyth, Gilbert F. (2018). CFD Python: the 12 steps to Navier-Stokes equations. Journal of Open Sour

Barba group 2.6k Dec 30, 2022
Exploration-Exploitation Dilemma Solving Methods

Exploration-Exploitation Dilemma Solving Methods Medium article for this repo - HERE In ths repo I implemented two techniques for tackling mentioned t

Aman Mishra 6 Jan 25, 2022
Official page of Patchwork (RA-L'21 w/ IROS'21)

Patchwork Official page of "Patchwork: Concentric Zone-based Region-wise Ground Segmentation with Ground Likelihood Estimation Using a 3D LiDAR Sensor

Hyungtae Lim 254 Jan 05, 2023
A PyTorch Toolbox for Face Recognition

FaceX-Zoo FaceX-Zoo is a PyTorch toolbox for face recognition. It provides a training module with various supervisory heads and backbones towards stat

JDAI-CV 1.6k Jan 06, 2023
Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression", TIP 2020

Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multil

Xuefeng 5 Jan 15, 2022
Papers about explainability of GNNs

Papers about explainability of GNNs

Dongsheng Luo 236 Jan 04, 2023
Official PyTorch implementation for FastDPM, a fast sampling algorithm for diffusion probabilistic models

Official PyTorch implementation for "On Fast Sampling of Diffusion Probabilistic Models". FastDPM generation on CIFAR-10, CelebA, and LSUN datasets. S

Zhifeng Kong 68 Dec 26, 2022
The sixth place winning solution (6/220) in 2021 Gaofen Challenge.

SwinTransformer + OBBDet The sixth place winning solution (6/220) in the track of Fine-grained Object Recognition in High-Resolution Optical Images, 2

ming71 46 Dec 02, 2022
PyTorch implementation of Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose

Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose Release Notes The official PyTorch implementation of Neural View S

Angtian Wang 20 Oct 09, 2022
Keras implementation of AdaBound

AdaBound for Keras Keras port of AdaBound Optimizer for PyTorch, from the paper Adaptive Gradient Methods with Dynamic Bound of Learning Rate. Usage A

Somshubra Majumdar 132 Sep 23, 2022