Code for NeurIPS 2021 paper: Invariant Causal Imitation Learning for Generalizable Policies

Last update: Dec 01, 2022

Overview

Invariant Causal Imitation Learning for Generalizable Policies

Ioana Bica, Daniel Jarrett, Mihaela van der Schaar

Neural Information Processing Systems (NeurIPS) 2021

Dependencies

The code was implemented in Python 3.6. Running it requires the following packages:

gym==0.17.2
numpy==1.18.2
pandas==1.0.4
tensorflow==1.15.0
torch==1.6.0
tqdm==4.32.1
scipy==1.1.0
scikit-learn==0.22.2
stable-baselines==2.10.1

Running and evaluating the model:

The control tasks used in the experiments are from OpenAI Gym [1]. Each control task has a true reward function that is unknown to the imitation algorithm. In each case, the “expert” demonstrator can be obtained from a pre-trained, hyperparameter-optimized agent in the RL Baselines Zoo [2], which is built on Stable Baselines [3].

This implementation provides expert demonstrations for 2 CartPole-v1 environments in 'volume/CartPole-v1'. Note that the code in 'contrib/baselines_zoo' was taken from [2].

To train and evaluate ICIL on CartPole-v1, run the following command with the chosen command line arguments. For reference, the expert performance is 500.

python testing/il.py

Options:
   --env                  # Environment name.
   --num_trajectories     # Number of expert trajectories used for training the imitation learning algorithm.
   --trial                # Trial number.

Outputs:

Average reward over 10 repetitions of running ICIL.
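
As a rough illustration of how such an average could be computed, the sketch below rolls out a policy and averages the episode returns over 10 repetitions. It is not the code in 'testing/il.py': the names `ConstantRewardEnv`, `rollout_return` and `average_return` are hypothetical, and the stub environment only mimics the old Gym `reset`/`step` interface and the 500-step episode cap of CartPole-v1 (reward 1 per step, which is why a perfect score is 500). Whether a "repetition" in this repository is a full training run or a single evaluation episode is not stated here, so the sketch treats it as one episode.

```python
import statistics


class ConstantRewardEnv:
    """Hypothetical stand-in for a Gym environment (old reset/step API).

    Pays reward 1.0 per step and ends after `horizon` steps, mimicking the
    500-step cap of CartPole-v1. It is NOT the real environment.
    """

    def __init__(self, horizon=500):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return 0.0, 1.0, done, {}  # obs, reward, done, info


def rollout_return(env, policy):
    """Run one episode and return the sum of rewards."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _info = env.step(policy(obs))
        total += reward
    return total


def average_return(make_env, policy, repetitions=10):
    """Mean and population standard deviation of returns over repetitions."""
    returns = [rollout_return(make_env(), policy) for _ in range(repetitions)]
    return statistics.mean(returns), statistics.pstdev(returns)


# A policy that always picks action 0; the stub env ignores the action.
mean_return, std_return = average_return(ConstantRewardEnv, lambda obs: 0)
print(mean_return, std_return)
```

With a real Gym environment, `make_env` could be a function that calls `gym.make` on the chosen environment name, and `policy` would wrap the trained imitation policy.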

Example usage

python testing/il.py  --env='CartPole-v1' --num_trajectories=20 --trial=0
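
For readers who want to see how the three options could be wired up, here is a minimal `argparse` sketch. It is an assumption about the interface, not a copy of 'testing/il.py': the function name `build_parser` and the default values are hypothetical, and only the option names and their descriptions come from the list above.

```python
import argparse


def build_parser():
    """Build a parser for the three documented options (defaults are assumed)."""
    parser = argparse.ArgumentParser(
        description="Train and evaluate ICIL (illustrative sketch)."
    )
    parser.add_argument("--env", type=str, default="CartPole-v1",
                        help="Environment name.")
    parser.add_argument("--num_trajectories", type=int, default=20,
                        help="Number of expert trajectories used for training.")
    parser.add_argument("--trial", type=int, default=0,
                        help="Trial number.")
    return parser


# Same arguments as the example command; the shell strips the quotes
# around 'CartPole-v1' before Python sees them.
args = build_parser().parse_args(
    ["--env=CartPole-v1", "--num_trajectories=20", "--trial=0"]
)
print(args.env, args.num_trajectories, args.trial)
```

Typing the numeric options as `int` means a malformed value such as `--trial=abc` is rejected at parse time rather than failing later during training.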

References

[1] Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. OpenAI Gym. OpenAI, 2016.

[2] Antonin Raffin. RL Baselines Zoo. https://github.com/araffin/rl-baselines-zoo, 2018.

[3] Ashley Hill, Antonin Raffin, Maximilian Ernestus, Adam Gleave, Anssi Kanervisto, Rene Traore, Prafulla Dhariwal, Christopher Hesse, Oleg Klimov, Alex Nichol, Matthias Plappert, Alec Radford, John Schulman, Szymon Sidor, and Yuhuai Wu. Stable Baselines. https://github.com/hill-a/stable-baselines, 2018.

Citation

If you use this code, please cite:

@inproceedings{bica2021invariant,
  title={Invariant Causal Imitation Learning for Generalizable Policies},
  author={Bica, Ioana and Jarrett, Daniel and van der Schaar, Mihaela},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021}
}
