ilpyt: imitation learning library with modular, baseline implementations in PyTorch

Last update: Nov 17, 2022

Overview

ilpyt

The imitation learning toolbox (ilpyt) contains modular implementations of common deep imitation learning algorithms in PyTorch, with unified infrastructure supporting key imitation learning and reinforcement learning algorithms. You can read more about ilpyt in our white paper.

Documentation is available here.

Main Features
Installation
Getting Started
Supported Algorithms and Environments
Benchmarks
Citation

Main Features

Implementation of baseline imitation learning algorithms: BC, DAgger, AppL, GCL, GAIL.
Implementation of baseline reinforcement learning algorithms, for comparison purposes: DQN, A2C, PPO2.
Modular, extensible framework for training, evaluating, and testing imitation learning (and reinforcement learning) algorithms.
Simple algorithm API that exposes train and test methods for quick setup and use (a fully functioning train and test pipeline takes fewer than ten lines of code).
A modular infrastructure for easy modification and reuse of existing components for novel algorithm implementations.
Parallel and serial modes: parallel for faster, optimized operation, and serial for debugging.
Compatibility with the OpenAI Gym environment interface for access to many existing benchmark learning environments, as well as the flexibility to create custom environments.

Installation

Note: ilpyt has only been tested on Ubuntu 20.04, and with Python 3.8.5.

Installing ilpyt requires a few prerequisites. The following commands set up the basics needed to run ilpyt with the OpenAI Gym environments:

# Install system-based packages
apt-get install cmake python3-pip python3-testresources freeglut3-dev xvfb

# Install Wheel
pip3 install --no-cache-dir --no-warn-script-location wheel

Install ilpyt using pip:

pip3 install ilpyt

# Or to install from source:
# pip3 install -e .

(Optional) Run the associated Python tests to confirm the package has installed successfully:

git clone https://github.com/mitre/ilpyt.git
cd ilpyt/

# To run all the tests
# If running headless, prepend the pytest command with `xvfb-run -a -s "-screen 0 1400x900x24 +extension RANDR" --`
pytest tests/

# Example: to run an individual test, like DQN
pytest tests/test_dqn.py

Getting Started

Sample Python scripts showing how to run the toolbox can be found in the examples directory. Documentation is available here.

Basic Usage

A minimal train and test snippet for an imitation learning algorithm takes fewer than 10 lines of code in ilpyt. In this basic example, we train a behavioral cloning algorithm for 10,000 epochs, then test the best policy for 100 episodes.

import ilpyt
from ilpyt.agents.imitation_agent import ImitationAgent
from ilpyt.algos.bc import BC

# Build the environment and a network suited to it
env = ilpyt.envs.build_env(env_id='LunarLander-v2', num_env=16)
net = ilpyt.nets.choose_net(env)
agent = ImitationAgent(net=net, lr=0.0001)

# Train behavioral cloning on the expert demonstrations, then evaluate
algo = BC(agent=agent, env=env)
algo.train(num_epochs=10000, expert_demos='demos/LunarLander-v2/demos.pkl')
algo.test(num_episodes=100)

Code Organization

At a high level, the algorithm orchestrates the training and testing of an agent in a particular environment. During training or testing, a runner executes the agent and environment in a loop to collect (state, action, reward, next state) transitions. The individual components of a transition (e.g., state or action) are typically torch Tensors. The agent then uses this batch of transitions to update its network and move towards an optimal action policy.
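The collection loop described above can be sketched in plain Python. This is only an illustration of the pattern, not ilpyt's runner: ToyEnv, RandomAgent, Transition, and collect_transitions are hypothetical names, and plain floats stand in for torch Tensors so the sketch has no dependencies.

```python
import random
from typing import List, NamedTuple


class Transition(NamedTuple):
    state: float
    action: int
    reward: float
    next_state: float


class ToyEnv:
    """Hypothetical 1-D environment: the state is a position on a line."""

    def __init__(self, horizon: int = 5) -> None:
        self.horizon = horizon
        self.position = 0.0
        self.steps = 0

    def reset(self) -> float:
        self.position = 0.0
        self.steps = 0
        return self.position

    def step(self, action: int):
        # Action 1 moves right, any other action moves left.
        self.position += 1.0 if action == 1 else -1.0
        self.steps += 1
        reward = self.position  # reward equals the new position
        done = self.steps >= self.horizon
        return self.position, reward, done


class RandomAgent:
    """Hypothetical agent that picks actions uniformly at random."""

    def __init__(self, seed: int = 0) -> None:
        self.rng = random.Random(seed)

    def act(self, state: float) -> int:
        return self.rng.choice([0, 1])


def collect_transitions(env: ToyEnv, agent: RandomAgent, num_episodes: int) -> List[Transition]:
    """Run the agent in the environment and record every transition."""
    batch: List[Transition] = []
    for _ in range(num_episodes):
        state = env.reset()
        done = False
        while not done:
            action = agent.act(state)
            next_state, reward, done = env.step(action)
            batch.append(Transition(state, action, reward, next_state))
            state = next_state
    return batch


batch = collect_transitions(ToyEnv(horizon=5), RandomAgent(seed=0), num_episodes=3)
print(len(batch))  # 3 episodes x 5 steps = 15 transitions
```

In a full pipeline, a batch like this is what the agent would consume in its update step.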

Customization

To implement a new algorithm, extend the BaseAlgorithm and BaseAgent abstract classes. For further customization, you can also build custom networks by extending the BaseNetwork interface. Each of these components is modular (see Code Organization for more details), so components can be easily swapped out. For example, the agent.generator used in the GAIL algorithm can be swapped between PPOAgent, DQNAgent, and A2CAgent. Similarly, new algorithm implementations can use existing classes as building blocks or extend the class interfaces for more customization.

Adding a custom environment is as simple as extending the OpenAI Gym Environment interface and registering it within your local gym environment registry.

See agents/base_agent.py, algos/base_algo.py, nets/base_net.py for more details.
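The extend-and-swap pattern can be illustrated with Python's abc module. The sketch below does not reproduce ilpyt's BaseAgent or BaseAlgorithm signatures; SketchAgent, SketchAlgorithm, and the concrete classes are hypothetical stand-ins chosen only to show how an algorithm written against an abstract agent interface accepts interchangeable agents.

```python
from abc import ABC, abstractmethod


class SketchAgent(ABC):
    """Hypothetical agent interface (not ilpyt's BaseAgent)."""

    @abstractmethod
    def step(self, state):
        """Return an action for the given state."""

    @abstractmethod
    def update(self, batch):
        """Update from a batch of data and return a loss value."""


class SketchAlgorithm(ABC):
    """Hypothetical algorithm interface (not ilpyt's BaseAlgorithm)."""

    def __init__(self, agent: SketchAgent) -> None:
        self.agent = agent

    @abstractmethod
    def train(self, num_epochs: int):
        """Run the training loop."""

    @abstractmethod
    def test(self, num_episodes: int):
        """Run the evaluation loop."""


class AlwaysLeftAgent(SketchAgent):
    def step(self, state):
        return 0

    def update(self, batch):
        return 0.0


class MirrorAgent(SketchAgent):
    """Picks action 1 for non-negative states, otherwise action 0."""

    def step(self, state):
        return 1 if state >= 0 else 0

    def update(self, batch):
        return float(len(batch))


class EchoAlgorithm(SketchAlgorithm):
    """Toy algorithm that queries the agent on fixed states, with no real environment."""

    STATES = [-1.0, 0.0, 2.0]

    def train(self, num_epochs: int):
        return [self.agent.update(self.STATES) for _ in range(num_epochs)]

    def test(self, num_episodes: int):
        return [[self.agent.step(s) for s in self.STATES] for _ in range(num_episodes)]


# The same algorithm runs unchanged with either agent.
for agent in (AlwaysLeftAgent(), MirrorAgent()):
    algo = EchoAlgorithm(agent=agent)
    algo.train(num_epochs=2)
    print(type(agent).__name__, algo.test(num_episodes=1))
```

Because the abstract methods are enforced at instantiation time, a subclass that forgets to implement one of them fails immediately with a TypeError rather than partway through training.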

Supported Algorithms and Environments

The following imitation learning (IL) algorithms are supported:

Behavioral Cloning (BC/ALVINN)
Dataset Aggregation (DAgger)
Generative Adversarial Imitation Learning (GAIL)
Apprenticeship Learning (AppL)
Guided Cost Learning (GCL)

The following reinforcement learning (RL) algorithms are supported:

Deep Q-Network (DQN)
Advantage Actor-Critic (A2C)
Proximal Policy Optimization (PPO)

OpenAI Gym environments with the following observation and action spaces are supported:

Observation space: Box(x,) and Box(x,y,z)
Action space: Discrete(x) and Box(x,)

NOTE: To create your own custom environment, follow the OpenAI Gym Environment interface, i.e., your environment must inherit from the OpenAI Gym environment class and implement its methods. More detailed instructions can be found on the OpenAI GitHub repository page on creating custom Gym environments.
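As a rough sketch of that interface, the toy environment below implements the classic Gym method layout: reset, step, render, and close. CoinGuessEnv is a hypothetical example, not part of ilpyt or Gym. To keep the sketch runnable without Gym installed, it does not subclass gym.Env or declare observation_space and action_space, both of which a real custom environment would need. It also assumes the older Gym convention in which step returns a 4-tuple; newer Gym releases use a different return shape.

```python
import random


class CoinGuessEnv:
    """Hypothetical toy environment: guess the coin face that was just observed.

    A real implementation would inherit from gym.Env and define
    observation_space and action_space; both are omitted here.
    """

    def __init__(self, max_steps: int = 10, seed: int = 0) -> None:
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.coin = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.steps = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        """Apply an action and return (observation, reward, done, info)."""
        reward = 1.0 if action == self.coin else 0.0
        self.steps += 1
        self.coin = self.rng.randint(0, 1)  # flip a new coin for the next step
        done = self.steps >= self.max_steps
        return self.coin, reward, done, {}

    def render(self, mode="human"):
        """Print a one-line text view of the current state."""
        print(f"step={self.steps} coin={self.coin}")

    def close(self):
        """Release resources; nothing to clean up in this toy example."""


env = CoinGuessEnv(max_steps=4)
obs = env.reset()
total, done = 0.0, False
while not done:
    # Guessing the observed coin is always correct in this toy setup.
    obs, reward, done, _ = env.step(obs)
    total += reward
env.close()
print(total)  # 4.0
```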

Benchmarks

Sample train and test results of the baseline algorithms on some environments:

	CartPole-v0	MountainCar-v0	MountainCarContinuous-v0	LunarLander-v2	LunarLanderContinuous-v2
Threshold	200	-110	90	200	200
Expert (Mean/Std)	200.00 / 0.00	-98.71 / 7.83	93.36 / 0.05	268.09 / 21.18	283.83 / 17.70
BC (Mean/Std)	200.00 / 0.00	-100.800 / 13.797	93.353 / 0.113	244.295 / 97.765	285.895 / 14.584
DAgger (Mean/Std)	200.00 / 0.00	-102.36 / 15.38	93.20 / 0.17	230.15 / 122.604	285.85 / 14.61
GAIL (Mean/Std)	200.00 / 0.00	-104.31 / 17.21	79.78 / 6.23	201.88 / 93.82	282.00 / 31.73
GCL (Mean/Std)	200.00 / 0.00	-	-	212.321 / 119.933	255.414 / 76.917
AppL (Mean/Std)	200.00 / 0.00	-108.60 / 22.843	-	-	-
DQN (Mean/Std)	-	-	-	281.96 / 24.57	-
A2C (Mean/Std)	-	-	-	201.26 / 62.52	-
PPO (Mean/Std)	-	-	-	249.72 / 75.05	-

The pre-trained weights for these models can be found in our Model Zoo.
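For readers reproducing a Mean/Std entry, a summary of per-episode rewards could be computed as sketched below. The rewards are made-up numbers, summarize is a hypothetical helper, and two choices are assumptions that the table does not state: the population standard deviation, and comparing the mean reward against the threshold.

```python
from statistics import mean, pstdev


def summarize(episode_rewards, threshold):
    """Return the mean, population std, and whether the mean meets the threshold."""
    avg = mean(episode_rewards)
    spread = pstdev(episode_rewards)
    return {"mean": avg, "std": spread, "meets_threshold": avg >= threshold}


# Made-up episode rewards, used only to exercise the helper.
rewards = [190.0, 210.0, 200.0, 200.0]
summary = summarize(rewards, threshold=200)
print(f"{summary['mean']:.2f} / {summary['std']:.2f}", summary["meets_threshold"])
```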

Citation

If you use ilpyt for your work, please cite our white paper:

@misc{ilpyt_2021,
  author = {Vu, Amanda and Tapley, Alex and Bissey, Brett},
  title = {ilpyt: Imitation Learning Research Code Base in PyTorch},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/mitre/ilpyt}},
}

Owner

The MITRE Corporation
