A wrapper for OpenAI Gym environments that vectorizes them with Ray

Overview

Ray Vector Environment Wrapper

Would you like to use Ray to vectorize your environment without using RLlib?
You came to the right place!

This package allows you to parallelize your environment using Ray.
Not only does it let you run environments in parallel, it also lets you run multiple sequential environments on each worker.
For example, you can run 80 workers in parallel, each running 10 sequential environments, for a total of 80 * 10 = 800 environments.
This can be useful if your environment is fast and running only 1 environment per worker leads to too much communication overhead between workers.
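
To make the trade-off concrete, the sketch below counts how many worker round trips one vectorized step would need. It rests on a simplifying assumption that is not taken from the package's source: each worker is contacted exactly once per step, however many environments it hosts. The function name `round_trips_per_step` is hypothetical.

```python
def round_trips_per_step(total_envs: int, envs_per_worker: int) -> int:
    """Number of workers that must be contacted for one vectorized step.

    Assumption: one request/response per worker per step, regardless of how
    many environments that worker runs sequentially.
    """
    if envs_per_worker <= 0:
        raise ValueError("envs_per_worker must be positive")
    if total_envs % envs_per_worker != 0:
        raise ValueError("total_envs must be a multiple of envs_per_worker")
    return total_envs // envs_per_worker


# 800 environments with one environment per worker.
one_env_each = round_trips_per_step(800, 1)
# The same 800 environments with 10 environments per worker.
ten_envs_each = round_trips_per_step(800, 10)
```

Under that assumption, grouping 10 environments per worker cuts the number of round trips per step by a factor of 10, which is where the saving for fast environments would come from.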

Installation

pip install RayEnvWrapper

If something goes wrong, it is most likely because of Ray.
For example, you might have issues installing Ray on an Apple Silicon (e.g., M1) laptop. See Ray's documentation for a simple fix.
At the moment, Ray does not support Python 3.10. This package has been tested with Python 3.9.

How does it work?

You first need to define a function that seeds and returns your environment.

Here is an example for CartPole:

import gym

def make_and_seed(seed: int) -> gym.Env:
    env = gym.make('CartPole-v0')
    env = gym.wrappers.RecordEpisodeStatistics(env)  # you can add extra wrappers around your original environment
    env.seed(seed)
    return env

Note: If you don't want to seed your environment, simply return it without using the seed. The function you define must still take a number as input.

Then, call the wrapper to create and wrap all the vectorized environments:

from RayEnvWrapper import WrapperRayVecEnv

number_of_workers = 4 # Usually, this is set to the number of CPUs in your machine
envs_per_worker = 2

vec_env = WrapperRayVecEnv(make_and_seed, number_of_workers, envs_per_worker)

You can then use your environment. The outputs of all the environments are stacked in a NumPy array.

Reset:

vec_env.reset()

Output

[[ 0.03073904  0.00145001 -0.03088818 -0.03131252]
 [ 0.03073904  0.00145001 -0.03088818 -0.03131252]
 [ 0.02281231 -0.02475473  0.02306162  0.02072129]
 [ 0.02281231 -0.02475473  0.02306162  0.02072129]
 [-0.03742824 -0.02316945  0.0148571   0.0296055 ]
 [-0.03742824 -0.02316945  0.0148571   0.0296055 ]
 [-0.0224773   0.04186813 -0.01038048  0.03759079]
 [-0.0224773   0.04186813 -0.01038048  0.03759079]]

The i-th entry represents the initial observation of the i-th environment.
Note: Because the environments are vectorized, you don't need to reset an environment explicitly at the end of an episode; this is done automatically. However, you need to call reset once at the beginning.
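
The automatic reset can be pictured with a toy example. The sketch below is not taken from the package's source: `CountdownEnv` and `step_with_auto_reset` are hypothetical names, and returning the fresh observation in place of the terminal one is an assumption based on how vectorized wrappers commonly behave.

```python
class CountdownEnv:
    """Toy stand-in for a Gym environment: an episode lasts `length` steps."""

    def __init__(self, length: int):
        self.length = length
        self.remaining = length

    def reset(self) -> int:
        self.remaining = self.length
        return self.remaining

    def step(self, action):
        self.remaining -= 1
        done = self.remaining == 0
        return self.remaining, 1.0, done, {}


def step_with_auto_reset(env, action):
    """Step once; if the episode ended, reset and return the fresh observation."""
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()  # the next episode starts immediately
    return obs, reward, done, info


env = CountdownEnv(length=2)
first_obs = env.reset()  # the one explicit reset, done at the beginning
trace = [step_with_auto_reset(env, None) for _ in range(3)]
observations = [obs for obs, _, _, _ in trace]
dones = [done for _, _, done, _ in trace]
```

The caller never calls `reset` again after the first time: the `done` flag still reports the end of the episode, and the observation returned alongside it already belongs to the next one.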

Take a random action:

vec_env.step([vec_env.action_space.sample() for _ in range(number_of_workers * envs_per_worker)])

Notice how the actions are passed: we pass an array containing one action for each of the environments.
Thus, the array is of size number_of_workers * envs_per_worker (i.e., the total number of environments).

Output

(array([[ 0.03076804, -0.19321568, -0.03151444,  0.25146705],
       [ 0.03076804, -0.19321568, -0.03151444,  0.25146705],
       [ 0.02231721, -0.22019969,  0.02347605,  0.3205903 ],
       [ 0.02231721, -0.22019969,  0.02347605,  0.3205903 ],
       [-0.03789163, -0.21850128,  0.01544921,  0.32693872],
       [-0.03789163, -0.21850128,  0.01544921,  0.32693872],
       [-0.02163994, -0.15310344, -0.00962866,  0.3269806 ],
       [-0.02163994, -0.15310344, -0.00962866,  0.3269806 ]],
      dtype=float32), 
 array([1., 1., 1., 1., 1., 1., 1., 1.], dtype=float32), 
 array([False, False, False, False, False, False, False, False]), 
 [{}, {}, {}, {}, {}, {}, {}, {}])
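
For illustration, a loop that consumes tuples of this shape could be sketched as follows. The helper `accumulate_rewards` and the stand-in class `FakeVecEnv` are hypothetical and exist only so that the sketch runs without Ray or Gym; the only thing assumed about the real wrapper is the `reset()` / `step(actions)` interface shown in this README.

```python
def accumulate_rewards(vec_env, actions_fn, num_envs, num_steps):
    """Run `num_steps` vectorized steps and sum the rewards per environment.

    `vec_env` only needs `reset()` and `step(actions)` returning
    (observations, rewards, dones, infos), with one entry per environment.
    """
    totals = [0.0] * num_envs
    vec_env.reset()  # required once before the first step
    for _ in range(num_steps):
        actions = [actions_fn() for _ in range(num_envs)]
        _obs, rewards, _dones, _infos = vec_env.step(actions)
        for i, reward in enumerate(rewards):
            totals[i] += float(reward)
    return totals


class FakeVecEnv:
    """Hypothetical stand-in that always returns a reward of 1.0 per environment."""

    def __init__(self, num_envs):
        self.num_envs = num_envs

    def reset(self):
        return [0] * self.num_envs

    def step(self, actions):
        assert len(actions) == self.num_envs  # one action per environment
        n = self.num_envs
        return [0] * n, [1.0] * n, [False] * n, [{} for _ in range(n)]


totals = accumulate_rewards(FakeVecEnv(8), lambda: 0, num_envs=8, num_steps=5)
```

With the real wrapper, the same helper would presumably be called with `vec_env`, `vec_env.action_space.sample`, and `number_of_workers * envs_per_worker` as the number of environments.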

As usual, the step method returns a tuple, except that here the observations, rewards, dones, and infos of all the environments are each grouped together, with one entry per environment.
In this specific example, we have 2 environments per worker.
Indices 0 and 1 are the environments from worker 1; indices 2 and 3 are the environments from worker 2, etc.
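
The index layout can be written as a small helper. This is a sketch that assumes environments are laid out worker by worker, as the pairs of identical rows in the outputs above suggest; `locate_env` is a hypothetical name, and it counts workers from 0 whereas the text counts them from 1.

```python
def locate_env(index: int, envs_per_worker: int) -> tuple:
    """Map a flat environment index to (worker, slot), both 0-based."""
    if index < 0:
        raise ValueError("index must be non-negative")
    if envs_per_worker <= 0:
        raise ValueError("envs_per_worker must be positive")
    # Integer division gives the worker, the remainder gives the slot on it.
    return divmod(index, envs_per_worker)


# 4 workers with 2 environments each, as in the example above.
layout = [locate_env(i, 2) for i in range(8)]
```

Here `layout[2]` is `(1, 0)`: the third row of the output belongs to the first environment of the second worker.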

License

Apache License 2.0

Releases(v1.0)
Owner
Pierre TASSEL