ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives

Last update: Dec 28, 2022

Status: Under development (expect bug fixes and huge updates)

ShinRL is an open-source JAX library specialized for evaluating reinforcement learning (RL) algorithms from both theoretical and practical perspectives. See the paper (arXiv:2112.04123) for details.

QuickStart

Try ShinRL at: experiments/QuickStart.ipynb.

import gym
from shinrl import DiscreteViSolver
import matplotlib.pyplot as plt

# make an env & a config
env = gym.make("ShinPendulum-v0")
config = DiscreteViSolver.DefaultConfig(explore="eps_greedy", approx="nn", steps_per_epoch=10000)

# make mixins
mixins = DiscreteViSolver.make_mixins(env, config)
# mixins == [DeepRlStepMixIn, QTargetMixIn, TbInitMixIn, NetActMixIn, NetInitMixIn, ShinExploreMixIn, ShinEvalMixIn, DiscreteViSolver]

# (optional) arrange mixins
# mixins.insert(2, UserDefinedMixIn)

# make & run a solver
dqn_solver = DiscreteViSolver.factory(env, config, mixins)
dqn_solver.run()

# plot performance
returns = dqn_solver.scalars["Return"]
plt.plot(returns["x"], returns["y"])

# plot learned q-values  (act == 0)
q0 = dqn_solver.tb_dict["Q"][:, 0]
env.plot_S(q0, title="Learned")

# plot oracle q-values  (act == 0)
q0 = env.calc_q(dqn_solver.tb_dict["ExploitPolicy"])[:, 0]
env.plot_S(q0, title="Oracle")

# plot optimal q-values  (act == 0)
q0 = env.calc_optimal_q()[:, 0]
env.plot_S(q0, title="Optimal")

⚡ Key Modules

ShinRL consists of two main modules:

ShinEnv: Implements relatively small MDP environments with access to oracle quantities.
Solver: Solves the environments (e.g., finds the optimal policy) with specified algorithms.

🔬 ShinEnv for Oracle Analysis

ShinEnv provides small environments with oracle methods that can compute exact quantities:
- calc_q computes a Q-value table containing all possible state-action pairs given a policy.
- calc_optimal_q computes the optimal Q-value table.
- calc_visit computes the state visitation frequency table for a given policy.
- calc_return is a shortcut for computing exact undiscounted returns for a given policy.
Some environments support continuous action spaces and image observations. See the following table and shinrl/envs/__init__.py for the available environments.

Environment	Discrete action	Continuous action	Image Observation	Tuple Observation
ShinMaze	✔️	❌	❌	✔️
ShinMountainCar-v0	✔️	✔️	✔️	✔️
ShinPendulum-v0	✔️	✔️	✔️	✔️
ShinCartPole-v0	✔️	✔️	❌	✔️
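
To make the idea of an "oracle" Q-value table concrete, here is a minimal pure-Python sketch of the kind of quantities that calc_q and calc_optimal_q return. It is not ShinRL's implementation: the two-state MDP, the discount factor, and the names P, R, GAMMA, evaluate_q, and optimal_q are all assumptions made for illustration, and the sketch assumes a discounted infinite-horizon setting.

```python
# Toy MDP (hypothetical, not a ShinRL environment): 2 states, 2 actions.
# P[s][a][s2] is the probability of reaching s2 from s under action a.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[1.0, 0.0], [0.0, 1.0]],  # state 1: action 0 moves to state 0, action 1 stays
]
# R[s][a] is the immediate reward; only "stay in state 1" is rewarded.
R = [
    [0.0, 0.0],
    [0.0, 1.0],
]
GAMMA = 0.9  # assumed discount factor
N_S, N_A = 2, 2


def backup(v):
    """One Bellman backup: Q(s, a) = R(s, a) + GAMMA * sum_s2 P(s2 | s, a) * V(s2)."""
    return [
        [R[s][a] + GAMMA * sum(P[s][a][s2] * v[s2] for s2 in range(N_S))
         for a in range(N_A)]
        for s in range(N_S)
    ]


def evaluate_q(policy, n_iters=1000):
    """Q-table of a fixed policy, where policy[s][a] is the probability of a in s."""
    q = [[0.0] * N_A for _ in range(N_S)]
    for _ in range(n_iters):
        # State value under the policy: expectation of Q over the policy's actions.
        v = [sum(policy[s][a] * q[s][a] for a in range(N_A)) for s in range(N_S)]
        q = backup(v)
    return q


def optimal_q(n_iters=1000):
    """Optimal Q-table via value iteration (greedy maximum instead of an expectation)."""
    q = [[0.0] * N_A for _ in range(N_S)]
    for _ in range(n_iters):
        v = [max(q[s]) for s in range(N_S)]
        q = backup(v)
    return q


uniform = [[0.5, 0.5], [0.5, 0.5]]
q_uniform = evaluate_q(uniform)
q_star = optimal_q()
print("Q under the uniform policy:", q_uniform)
print("Optimal Q:", q_star)
```

Because the tables cover every state-action pair, a learned Q-function can be compared against them entry by entry, which is the kind of analysis the oracle methods are meant to support.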

🏭 Flexible Solver by MixIn

A "mixin" is a class that defines and implements a single feature. ShinRL instantiates a solver by combining several mixins.
By rearranging mixins, you can implement your own ideas on top of ShinRL's code base. See experiments/QuickStart.ipynb for an example.
The following code shows how different mixin combinations yield "value iteration" and "deep Q-learning":

import gym
from shinrl import DiscreteViSolver

env = gym.make("ShinPendulum-v0")

# run value iteration (dynamic programming)
config = DiscreteViSolver.DefaultConfig(approx="tabular", explore="oracle")
mixins = DiscreteViSolver.make_mixins(env, config)
# mixins == [TabularDpStepMixIn, QTargetMixIn, TbInitMixIn, ShinExploreMixIn, ShinEvalMixIn, DiscreteViSolver]
vi_solver = DiscreteViSolver.factory(env, config, mixins)
vi_solver.run()

# run deep Q learning 
config = DiscreteViSolver.DefaultConfig(approx="nn", explore="eps_greedy")
mixins = DiscreteViSolver.make_mixins(env, config)  
# mixins == [DeepRlStepMixIn, QTargetMixIn, TbInitMixIn, NetActMixIn, NetInitMixIn, ShinExploreMixIn, ShinEvalMixIn, DiscreteViSolver]
dql_solver = DiscreteViSolver.factory(env, config, mixins)
dql_solver.run()

# ShinRL also provides deep RL solvers that support OpenAI Gym environments.
env = gym.make("CartPole-v0")
mixins = DiscreteViSolver.make_mixins(env, config)  
# mixins == [DeepRlStepMixIn, QTargetMixIn, TargetMixIn, NetActMixIn, NetInitMixIn, GymExploreMixIn, GymEvalMixIn, DiscreteViSolver]
dql_solver = DiscreteViSolver.factory(env, config, mixins)
dql_solver.run()
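
The way an ordered list of mixins can turn into a single solver can be sketched with plain Python multiple inheritance. The example below is only an illustration of the general pattern, not ShinRL's code: the classes BaseSolver, TabularStepMixIn, DeepStepMixIn, and EpsGreedyMixIn and the function build_solver are hypothetical, and ShinRL's own factory may work differently.

```python
class BaseSolver:
    """Hypothetical base class providing default behaviour for every feature."""

    def step(self):
        return ["base step"]

    def explore(self):
        return "no exploration"


class TabularStepMixIn:
    """Hypothetical mixin: overrides only the update step."""

    def step(self):
        # super() resolves to the next class in the method resolution order.
        return ["tabular update"] + super().step()


class DeepStepMixIn:
    """Hypothetical mixin: an alternative update step."""

    def step(self):
        return ["gradient update"] + super().step()


class EpsGreedyMixIn:
    """Hypothetical mixin: overrides only the exploration strategy."""

    def explore(self):
        return "eps-greedy"


def build_solver(mixins):
    """Create a class whose bases are `mixins` (earlier entries take priority)
    and return an instance of it."""
    solver_cls = type("MixedSolver", tuple(mixins), {})
    return solver_cls()


# Two different mixin lists give two differently behaving solvers.
vi_like = build_solver([TabularStepMixIn, BaseSolver])
dqn_like = build_solver([DeepStepMixIn, EpsGreedyMixIn, BaseSolver])

print(vi_like.step(), vi_like.explore())
print(dqn_like.step(), dqn_like.explore())
```

In this pattern, inserting a user-defined class earlier in the list lets it override or wrap a single method while every other feature is inherited unchanged, which is why the order of the list matters.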

Installation

git clone git@github.com:omron-sinicx/ShinRL.git
cd ShinRL
pip install -e .

Test

cd ShinRL
make test

Format

cd ShinRL
make format

Docker

cd ShinRL
docker-compose up

Citation

# NeurIPS DRL WS 2021 version
@inproceedings{toshinori2021shinrl,
    author = {Kitamura, Toshinori and Yonetani, Ryo},
    title = {ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives},
    year = {2021},
    booktitle = {Proceedings of the NeurIPS Deep RL Workshop},
}

# arXiv version
@article{toshinori2021shinrlArxiv,
    author = {Kitamura, Toshinori and Yonetani, Ryo},
    title = {ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives},
    year = {2021},
    url = {https://arxiv.org/abs/2112.04123},
    journal={arXiv preprint arXiv:2112.04123},
}
