ilpyt: imitation learning library with modular, baseline implementations in Pytorch

Overview

ilpyt

The imitation learning toolbox (ilpyt) contains modular implementations of common deep imitation learning algorithms in PyTorch, with unified infrastructure supporting key imitation learning and reinforcement learning algorithms. You can read more about ilpyt in our white paper.

Documentation is available here.

Table of Contents

Main Features

  • Implementation of baseline imitation learning algorithms: BC, DAgger, AppL, GCL, GAIL.
  • Implementation of baseline reinforcement learning algorithms, for comparison purposes: DQN, A2C, PPO2.
  • Modular, extensible framework for training, evaluating, and testing imitation learning (and reinforcement learning) algorithms.
  • Simple algorithm API which exposes train and test methods, allowing for quick library setup and use (a basic usage of the library requires less than ten lines of code to have a fully functioning train and test pipeline).
  • A modular infrastructure for easy modification and reuse of existing components for novel algorithm implementations.
  • Parallel and serialization modes, allowing for faster, optimized operations or serial operations for debugging.
  • Compatibility with the OpenAI Gym environment interface for access to many existing benchmark learning environments, as well as the flexibility to create custom environments.

Installation

Note: ilpyt has only been tested on Ubuntu 20.04, and with Python 3.8.5.

  1. In order to install ilpyt, there are a few prerequisites required. The following commands will setup all the basics so you can run ilpyt with the OpenAI Gym environments:
# Install system-based packages
apt-get install cmake python3-pip python3-testresources freeglut3-dev xvfb

# Install Wheel
pip3 install --no-cache-dir --no-warn-script-location wheel
  1. Install ilpyt using pip:
pip3 install ilpyt

# Or to install from source:
# pip3 install -e .
  1. (Optional) Run the associated Python tests to confirm the package has installed successfully:
git clone https://github.com/mitre/ilpyt.git
cd ilpyt/

# To run all the tests
# If running headless, prepend the pytest command with `xvfb-run -a -s "-screen 0 1400x900x24 +extension RANDR" --`
pytest tests/

# Example: to run an individual test, like DQN
pytest tests/test_dqn.py 

Getting Started

Various sample Python script(s) of how to run the toolbox can be found within the examples directory. Documentation is available here.

Basic Usage

Various sample Python script(s) of how to run the toolbox can be found within the examples directory. A minimal train and test snippet for an imitation learning algorithm takes less than 10 lines of code in ilpyt. In this basic example, we are training a behavioral cloning algorithm for 10,000 epochs before testing the best policy for 100 episodes.

import ilpyt
from ilpyt.agents.imitation_agent import ImitationAgent
from ilpyt.algos.bc import BC

env = ilpyt.envs.build_env(env_id='LunarLander-v2',  num_env=16)
net = ilpyt.nets.choose_net(env)
agent = ImitationAgent(net=net, lr=0.0001)

algo = BC(agent=agent, env=env)
algo.train(num_epochs=10000, expert_demos='demos/LunarLander-v2/demos.pkl')
algo.test(num_episodes=100)

Code Organization

workflow

At a high-level, the algorithm orchestrates the training and testing of our agent in a particular environment. During these training or testing loops, a runner will execute the agent and environment in a loop to collect (state, action, reward, next state) transitions. The individual components of a transition (e.g., state or action) are typically torch Tensors. The agent can then use this batch of transitions to update its network and move towards an optimal action policy.

Customization

To implement a new algorithm, one simply has to extend the BaseAlgorithm and BaseAgent abstract classes (for even further customization, one can even make custom networks by extending the BaseNetwork interface). Each of these components is modular (see code organization for more details), allowing components to be easily swapped out. (For example, the agent.generator used in the GAIL algorithm can be easily swapped between PPOAgent, DQNAgent, or A2Cagent. In a similar way, new algorithm implementations can utilize existing implemented classes as building blocks, or extend the class interfaces for more customization.)

Adding a custom environment is as simple as extending the OpenAI Gym Environment interface and registering it within your local gym environment registry.

See agents/base_agent.py, algos/base_algo.py, nets/base_net.py for more details.

Supported Algorithms and Environments

The following imitation learning (IL) algorithms are supported:

The following reinforcement learning (RL) algorithms are supported:

The following OpenAI Gym Environments are supported. Environments with:

  • Observation space: Box(x,) and Box(x,y,z)
  • Action space: Discrete(x) and Box(x,)

NOTE: To create your own custom environment, just follow the OpenAI Gym Environment interface. i.e., your environment must implement the following methods (and inherit from the OpenAI Gym Class). More detailed instructions can be found on the OpenAI GitHub repository page on creating custom Gym environments.

Benchmarks

Sample train and test results of the baseline algorithms on some environments:

CartPole-v0 MountainCar-v0 MountainCarContinuous-v0 LunarLander-v2 LunarLanderContinuous-v2
Threshold 200 -110 90 200 200
Expert (Mean/Std) 200.00 / 0.00 -98.71 / 7.83 93.36 / 0.05 268.09 / 21.18 283.83 / 17.70
BC (Mean/Std) 200.00 / 0.00 -100.800 / 13.797 93.353 / 0.113 244.295 / 97.765 285.895 / 14.584
DAgger (Mean/Std) 200.00 / 0.00 -102.36 / 15.38 93.20 / 0.17 230.15 / 122.604 285.85 / 14.61
GAIL (Mean/Std) 200.00 / 0.00 -104.31 / 17.21 79.78 / 6.23 201.88 / 93.82 282.00 / 31.73
GCL 200.00 / 0.00 - - 212.321 / 119.933 255.414 / 76.917
AppL(Mean/Std) 200.00 / 0.00 -108.60 / 22.843 - - -
DQN (Mean/Std) - - - 281.96 / 24.57 -
A2C (Mean/Std) - - 201.26 / 62.52 -
PPO (Mean/Std) - - - 249.72 / 75.05 -

The pre-trained weights for these models can be found in our Model Zoo.

Citation

If you use ilpyt for your work, please cite our white paper:

@misc{ilpyt_2021,
  author = {Vu, Amanda and Tapley, Alex and Bissey, Brett},
  title = {ilpyt: Imitation Learning Research Code Base in PyTorch},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/mitre/ilpyt}},
}
Owner
The MITRE Corporation
Open Source Software from the MITRE Corporation
The MITRE Corporation
Tgbox-bench - Simple TGBOX upload speed benchmark

TGBOX Benchmark This script will benchmark upload speed to TGBOX storage. Build

Non 1 Jan 09, 2022
Charsiu: A transformer-based phonetic aligner

Charsiu: A transformer-based phonetic aligner [arXiv] Note. This is a preview version. The aligner is under active development. New functions, new lan

jzhu 166 Dec 09, 2022
Convert Table data to approximate values with GUI

Table_Editor Convert Table data to approximate values with GUIs... usage - Import methods for extension Tables. Imported method supposed to have only

CLJ 1 Jan 10, 2022
DeepLearning Anomalies Detection with Bluetooth Sensor Data

Final Year Project. Constructing models to create offline anomalies detection using Travel Time Data collected from Bluetooth sensors along the route.

1 Jan 10, 2022
PyoMyo - Python Opensource Myo library

PyoMyo Python module for the Thalmic Labs Myo armband. Cross platform and multithreaded and works without the Myo SDK. pip install pyomyo Documentati

PerlinWarp 81 Jan 08, 2023
Bayesian optimization in PyTorch

BoTorch is a library for Bayesian Optimization built on PyTorch. BoTorch is currently in beta and under active development! Why BoTorch ? BoTorch Prov

2.5k Dec 31, 2022
GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs

GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs [Paper, Slides, Video Talk] at USENIX OSDI'21 @inproceedings{GNNAdvisor, title=

YUKE WANG 47 Jan 03, 2023
The most simple and minimalistic navigation dashboard.

Navigation This project follows a goal to have simple and lightweight dashboard with different links. I use it to have my own self-hosted service dash

Yaroslav 23 Dec 23, 2022
The PyTorch implementation for paper "Neural Texture Extraction and Distribution for Controllable Person Image Synthesis" (CVPR2022 Oral)

ArXiv | Get Start Neural-Texture-Extraction-Distribution The PyTorch implementation for our paper "Neural Texture Extraction and Distribution for Cont

Ren Yurui 111 Dec 10, 2022
On the model-based stochastic value gradient for continuous reinforcement learning

On the model-based stochastic value gradient for continuous reinforcement learning This repository is by Brandon Amos, Samuel Stanton, Denis Yarats, a

Facebook Research 46 Dec 15, 2022
The final project of "Applying AI to 3D Medical Imaging Data" from "AI for Healthcare" nanodegree - Udacity.

Quantifying Hippocampus Volume for Alzheimer's Progression Background Alzheimer's disease (AD) is a progressive neurodegenerative disorder that result

Omar Laham 1 Jan 14, 2022
Mall-Customers-Segmentation - Customer Segmentation Using K-Means Clustering

Overview Customer Segmentation is one the most important applications of unsupervised learning. Using clustering techniques, companies can identify th

NelakurthiSudheer 2 Jan 03, 2022
Learnable Boundary Guided Adversarial Training (ICCV2021)

Learnable Boundary Guided Adversarial Training This repository contains the implementation code for the ICCV2021 paper: Learnable Boundary Guided Adve

DV Lab 27 Sep 25, 2022
[NeurIPS 2020] Code for the paper "Balanced Meta-Softmax for Long-Tailed Visual Recognition"

Balanced Meta-Softmax Code for the paper Balanced Meta-Softmax for Long-Tailed Visual Recognition Jiawei Ren, Cunjun Yu, Shunan Sheng, Xiao Ma, Haiyu

Jiawei Ren 65 Dec 21, 2022
Blender Add-on that sets a Material's Base Color to one of Pantone's Colors of the Year

Blender PCOY (Pantone Color of the Year) MCMC (Mid-Century Modern Colors) HG71 (House & Garden Colors 1971) Blender Add-ons That Assign a Custom Color

Don Schnitzius 15 Nov 20, 2022
Lingvo is a framework for building neural networks in Tensorflow, particularly sequence models.

Lingvo is a framework for building neural networks in Tensorflow, particularly sequence models.

2.7k Jan 05, 2023
YOLOX-RMPOLY

本算法为适应robomaster比赛,而改动自矩形识别的yolox算法。 基于旷视科技YOLOX,实现对不规则四边形的目标检测 TODO 修改onnx推理模型 更改/添加标注: 1.yolox/models/yolox_polyhead.py: 1.1继承yolox/models/yolo_

3 Feb 25, 2022
I will implement Fastai in each projects present in this repository.

DEEP LEARNING FOR CODERS WITH FASTAI AND PYTORCH The repository contains a list of the projects which I have worked on while reading the book Deep Lea

Thinam Tamang 43 Dec 20, 2022
[ICCV 2021] Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

Counterfactual Attention Learning Created by Yongming Rao*, Guangyi Chen*, Jiwen Lu, Jie Zhou This repository contains PyTorch implementation for ICCV

Yongming Rao 90 Dec 31, 2022
Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt

Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt. This is done by

Mehdi Cherti 135 Dec 30, 2022