ilpyt: imitation learning library with modular, baseline implementations in Pytorch

Overview

ilpyt

The imitation learning toolbox (ilpyt) contains modular implementations of common deep imitation learning algorithms in PyTorch, with unified infrastructure supporting key imitation learning and reinforcement learning algorithms. You can read more about ilpyt in our white paper.

Documentation is available here.

Table of Contents

Main Features

  • Implementation of baseline imitation learning algorithms: BC, DAgger, AppL, GCL, GAIL.
  • Implementation of baseline reinforcement learning algorithms, for comparison purposes: DQN, A2C, PPO2.
  • Modular, extensible framework for training, evaluating, and testing imitation learning (and reinforcement learning) algorithms.
  • Simple algorithm API which exposes train and test methods, allowing for quick library setup and use (a basic usage of the library requires less than ten lines of code to have a fully functioning train and test pipeline).
  • A modular infrastructure for easy modification and reuse of existing components for novel algorithm implementations.
  • Parallel and serialization modes, allowing for faster, optimized operations or serial operations for debugging.
  • Compatibility with the OpenAI Gym environment interface for access to many existing benchmark learning environments, as well as the flexibility to create custom environments.

Installation

Note: ilpyt has only been tested on Ubuntu 20.04, and with Python 3.8.5.

  1. In order to install ilpyt, there are a few prerequisites required. The following commands will setup all the basics so you can run ilpyt with the OpenAI Gym environments:
# Install system-based packages
apt-get install cmake python3-pip python3-testresources freeglut3-dev xvfb

# Install Wheel
pip3 install --no-cache-dir --no-warn-script-location wheel
  1. Install ilpyt using pip:
pip3 install ilpyt

# Or to install from source:
# pip3 install -e .
  1. (Optional) Run the associated Python tests to confirm the package has installed successfully:
git clone https://github.com/mitre/ilpyt.git
cd ilpyt/

# To run all the tests
# If running headless, prepend the pytest command with `xvfb-run -a -s "-screen 0 1400x900x24 +extension RANDR" --`
pytest tests/

# Example: to run an individual test, like DQN
pytest tests/test_dqn.py 

Getting Started

Various sample Python script(s) of how to run the toolbox can be found within the examples directory. Documentation is available here.

Basic Usage

Various sample Python script(s) of how to run the toolbox can be found within the examples directory. A minimal train and test snippet for an imitation learning algorithm takes less than 10 lines of code in ilpyt. In this basic example, we are training a behavioral cloning algorithm for 10,000 epochs before testing the best policy for 100 episodes.

import ilpyt
from ilpyt.agents.imitation_agent import ImitationAgent
from ilpyt.algos.bc import BC

env = ilpyt.envs.build_env(env_id='LunarLander-v2',  num_env=16)
net = ilpyt.nets.choose_net(env)
agent = ImitationAgent(net=net, lr=0.0001)

algo = BC(agent=agent, env=env)
algo.train(num_epochs=10000, expert_demos='demos/LunarLander-v2/demos.pkl')
algo.test(num_episodes=100)

Code Organization

workflow

At a high-level, the algorithm orchestrates the training and testing of our agent in a particular environment. During these training or testing loops, a runner will execute the agent and environment in a loop to collect (state, action, reward, next state) transitions. The individual components of a transition (e.g., state or action) are typically torch Tensors. The agent can then use this batch of transitions to update its network and move towards an optimal action policy.

Customization

To implement a new algorithm, one simply has to extend the BaseAlgorithm and BaseAgent abstract classes (for even further customization, one can even make custom networks by extending the BaseNetwork interface). Each of these components is modular (see code organization for more details), allowing components to be easily swapped out. (For example, the agent.generator used in the GAIL algorithm can be easily swapped between PPOAgent, DQNAgent, or A2Cagent. In a similar way, new algorithm implementations can utilize existing implemented classes as building blocks, or extend the class interfaces for more customization.)

Adding a custom environment is as simple as extending the OpenAI Gym Environment interface and registering it within your local gym environment registry.

See agents/base_agent.py, algos/base_algo.py, nets/base_net.py for more details.

Supported Algorithms and Environments

The following imitation learning (IL) algorithms are supported:

The following reinforcement learning (RL) algorithms are supported:

The following OpenAI Gym Environments are supported. Environments with:

  • Observation space: Box(x,) and Box(x,y,z)
  • Action space: Discrete(x) and Box(x,)

NOTE: To create your own custom environment, just follow the OpenAI Gym Environment interface. i.e., your environment must implement the following methods (and inherit from the OpenAI Gym Class). More detailed instructions can be found on the OpenAI GitHub repository page on creating custom Gym environments.

Benchmarks

Sample train and test results of the baseline algorithms on some environments:

CartPole-v0 MountainCar-v0 MountainCarContinuous-v0 LunarLander-v2 LunarLanderContinuous-v2
Threshold 200 -110 90 200 200
Expert (Mean/Std) 200.00 / 0.00 -98.71 / 7.83 93.36 / 0.05 268.09 / 21.18 283.83 / 17.70
BC (Mean/Std) 200.00 / 0.00 -100.800 / 13.797 93.353 / 0.113 244.295 / 97.765 285.895 / 14.584
DAgger (Mean/Std) 200.00 / 0.00 -102.36 / 15.38 93.20 / 0.17 230.15 / 122.604 285.85 / 14.61
GAIL (Mean/Std) 200.00 / 0.00 -104.31 / 17.21 79.78 / 6.23 201.88 / 93.82 282.00 / 31.73
GCL 200.00 / 0.00 - - 212.321 / 119.933 255.414 / 76.917
AppL(Mean/Std) 200.00 / 0.00 -108.60 / 22.843 - - -
DQN (Mean/Std) - - - 281.96 / 24.57 -
A2C (Mean/Std) - - 201.26 / 62.52 -
PPO (Mean/Std) - - - 249.72 / 75.05 -

The pre-trained weights for these models can be found in our Model Zoo.

Citation

If you use ilpyt for your work, please cite our white paper:

@misc{ilpyt_2021,
  author = {Vu, Amanda and Tapley, Alex and Bissey, Brett},
  title = {ilpyt: Imitation Learning Research Code Base in PyTorch},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/mitre/ilpyt}},
}
Owner
The MITRE Corporation
Open Source Software from the MITRE Corporation
The MITRE Corporation
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Model

PLAI Group at UBC 42 Dec 06, 2022
The code for Expectation-Maximization Attention Networks for Semantic Segmentation (ICCV'2019 Oral)

EMANet News The bug in loading the pretrained model is now fixed. I have updated the .pth. To use it, download it again. EMANet-101 gets 80.99 on the

Xia Li 李夏 663 Nov 30, 2022
A curated list of the top 10 computer vision papers in 2021 with video demos, articles, code and paper reference.

The Top 10 Computer Vision Papers of 2021 The top 10 computer vision papers in 2021 with video demos, articles, code, and paper reference. While the w

Louis-François Bouchard 118 Dec 21, 2022
nn_builder lets you build neural networks with less boilerplate code

nn_builder lets you build neural networks with less boilerplate code. You specify the type of network you want and it builds it. Install pip install n

Petros Christodoulou 157 Nov 20, 2022
Pywonderland - A tour in the wonderland of math with python.

A Tour in the Wonderland of Math with Python A collection of python scripts for drawing beautiful figures and animating interesting algorithms in math

Zhao Liang 4.1k Jan 03, 2023
YOLOX-RMPOLY

本算法为适应robomaster比赛,而改动自矩形识别的yolox算法。 基于旷视科技YOLOX,实现对不规则四边形的目标检测 TODO 修改onnx推理模型 更改/添加标注: 1.yolox/models/yolox_polyhead.py: 1.1继承yolox/models/yolo_

3 Feb 25, 2022
Code for the Lovász-Softmax loss (CVPR 2018)

The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks Maxim Berman, Amal Ranne

Maxim Berman 1.3k Jan 04, 2023
NR-GAN: Noise Robust Generative Adversarial Networks

Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter Code and checkpoints for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling

Takuhiro Kaneko 59 Dec 11, 2022
Model Zoo for MindSpore

Welcome to the Model Zoo for MindSpore In order to facilitate developers to enjoy the benefits of MindSpore framework, we will continue to add typical

MindSpore 226 Jan 07, 2023
Implementation of paper "Towards a Unified View of Parameter-Efficient Transfer Learning"

A Unified Framework for Parameter-Efficient Transfer Learning This is the official implementation of the paper: Towards a Unified View of Parameter-Ef

Junxian He 216 Dec 29, 2022
DeLighT: Very Deep and Light-Weight Transformers

DeLighT: Very Deep and Light-weight Transformers This repository contains the source code of our work on building efficient sequence models: DeFINE (I

Sachin Mehta 440 Dec 18, 2022
O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis

O-CNN This repository contains the implementation of our papers related with O-CNN. The code is released under the MIT license. O-CNN: Octree-based Co

Microsoft 607 Dec 28, 2022
Identifying Stroke Indicators Using Rough Sets

Identifying Stroke Indicators Using Rough Sets With the spirit of reproducible research, this repository contains all the codes required to produce th

Muhammad Salman Pathan 0 Jun 09, 2022
A PyTorch-based library for fast prototyping and sharing of deep neural network models.

A PyTorch-based library for fast prototyping and sharing of deep neural network models.

78 Jan 03, 2023
Unofficial Implementation of RobustSTL: A Robust Seasonal-Trend Decomposition Algorithm for Long Time Series (AAAI 2019)

RobustSTL: A Robust Seasonal-Trend Decomposition Algorithm for Long Time Series (AAAI 2019) This repository contains python (3.5.2) implementation of

Doyup Lee 222 Dec 21, 2022
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

107 Dec 02, 2022
TensorFlow implementation of ENet, trained on the Cityscapes dataset.

segmentation TensorFlow implementation of ENet (https://arxiv.org/pdf/1606.02147.pdf) based on the official Torch implementation (https://github.com/e

Fredrik Gustafsson 248 Dec 16, 2022
A fast Evolution Strategy implementation in Python

Evostra: Evolution Strategy for Python Evolution Strategy (ES) is an optimization technique based on ideas of adaptation and evolution. You can learn

Mika 251 Dec 08, 2022
Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image

Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image (Project page) Zhengqin Li, Mohammad Sha

209 Jan 05, 2023
Bayesian Image Reconstruction using Deep Generative Models

Bayesian Image Reconstruction using Deep Generative Models R. Marinescu, D. Moyer, P. Golland For technical inquiries, please create a Github issue. F

Razvan Valentin Marinescu 51 Nov 23, 2022