RLDS stands for Reinforcement Learning Datasets


RLDS

RLDS stands for Reinforcement Learning Datasets. It is an ecosystem of tools to store, retrieve and manipulate episodic data in the context of Sequential Decision Making, including Reinforcement Learning (RL), Learning from Demonstrations, Offline RL and Imitation Learning.

This repository includes a library for manipulating RLDS-compliant datasets. For other parts of the pipeline, please refer to:

  • EnvLogger to create synthetic datasets
  • RLDS Creator to create datasets where a human interacts with an environment.
  • TFDS for existing RL datasets.

QuickStart & Colabs

See how to use RLDS in this tutorial.

You can find more examples in the following colabs:

Dataset Format

The dataset is retrieved as a tf.data.Dataset of Episodes where each episode contains a tf.data.Dataset of steps.


  • Episode: dictionary that contains a tf.data.Dataset of Steps, and metadata.

  • Step: dictionary that contains:

    • observation: current observation
    • action: action taken in response to the current observation
    • reward: reward obtained after applying the action to the current observation
    • is_terminal: whether this is a terminal step
    • is_first: whether this is the first step of an episode, which contains the initial state.
    • is_last: whether this is the last step of an episode, which contains the last observation. When true, the action, reward, discount, and any other custom fields that follow the observation are considered invalid.
    • discount: discount factor at this step.
    • extra metadata

    When is_terminal = True, the observation corresponds to a final state, so reward, discount and action are meaningless. Depending on the environment, the final observation may also be meaningless.

    If an episode ends in a step where is_terminal = False, it means that this episode has been truncated. In this case, depending on the environment, the action, reward and discount might be empty as well.
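The flag semantics above can be illustrated with a minimal sketch. It uses plain Python dictionaries and lists in place of the nested tf.data.Dataset objects that RLDS actually returns, and the names make_step, episode_ending and episode_id are hypothetical helpers introduced only for this example. The "steps" key for the nested steps is also an assumption of this sketch.

```python
from typing import Any, Dict

Step = Dict[str, Any]
Episode = Dict[str, Any]


def make_step(observation: Any, action: Any, reward: float, discount: float,
              is_first: bool = False, is_last: bool = False,
              is_terminal: bool = False) -> Step:
    """Builds one step dictionary with the fields listed above."""
    return {
        "observation": observation,
        "action": action,
        "reward": reward,
        "discount": discount,
        "is_first": is_first,
        "is_last": is_last,
        "is_terminal": is_terminal,
    }


def episode_ending(episode: Episode) -> str:
    """Classifies how an episode ended from the flags of its final step."""
    last_step = episode["steps"][-1]
    if not last_step["is_last"]:
        return "incomplete"  # the final step was never recorded
    return "terminated" if last_step["is_terminal"] else "truncated"


# Episode that reaches a final state: the last step is terminal.
terminated_episode = {
    "episode_id": 0,  # hypothetical per-episode metadata
    "steps": [
        make_step([0.0], 1, 0.5, 1.0, is_first=True),
        make_step([1.0], 0, 0.0, 0.0, is_last=True, is_terminal=True),
    ],
}

# Episode cut short (for example by a time limit): last step, but not terminal.
truncated_episode = {
    "episode_id": 1,
    "steps": [
        make_step([0.0], 1, 0.5, 1.0, is_first=True),
        make_step([1.0], 0, 0.0, 1.0, is_last=True),
    ],
}

print(episode_ending(terminated_episode))  # terminated
print(episode_ending(truncated_episode))   # truncated
```

In both cases the action, reward and discount stored in the last step are placeholders and should not be used for training.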

How to create a dataset

Although you can read datasets with the RLDS format even if they were not created with our tools (for example, by adding them to TFDS), we recommend the use of EnvLogger and RLDS Creator as they ensure that the data is stored in a lossless fashion and compatible with RLDS.

Synthetic datasets

EnvLogger provides a dm_env Environment class wrapper that records the interactions between a real environment and an agent.

import envlogger

# `environment` is an existing dm_env.Environment instance.
env = envlogger.EnvLogger(
      environment,
      data_directory='/tmp/mydataset')

In addition, two callbacks can be passed to the EnvLogger constructor to store per-step and per-episode metadata. See the EnvLogger documentation for more details.

Note that per-session metadata can be stored but is currently ignored when loading the dataset.

Note that EnvLogger follows the dm_env convention. Consider the following notation:

  • o_i: observation at step i
  • a_i: action applied to o_i
  • r_i: reward obtained when applying a_i in o_i
  • d_i: discount for reward r_i
  • m_i: metadata for step i

Data is generated and stored as:

    (o_0, _, _, _, m_0) → (o_1, a_0, r_0, d_0, m_1) → (o_2, a_1, r_1, d_1, m_2) ⇢ ...

But loaded with RLDS as:

    (o_0, a_0, r_0, d_0, m_0) → (o_1, a_1, r_1, d_1, m_1) → (o_2, a_2, r_2, d_2, m_2) ⇢ ...
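The shift between the two layouts can be sketched in plain Python. This is only an illustration of the index realignment, not the code RLDS uses when loading. The function name realign_steps is hypothetical, and filling the last step's action, reward and discount with None is an assumption of this sketch (those fields are invalid on the last step in any case).

```python
from typing import Any, List, Tuple

# (observation, action, reward, discount, metadata)
StepTuple = Tuple[Any, Any, Any, Any, Any]


def realign_steps(stored: List[StepTuple]) -> List[StepTuple]:
    """Converts dm_env-style stored steps into the RLDS step layout.

    Stored step i carries (o_i, a_{i-1}, r_{i-1}, d_{i-1}, m_i).
    Loaded step i carries (o_i, a_i, r_i, d_i, m_i).
    """
    loaded = []
    for i, (observation, _, _, _, metadata) in enumerate(stored):
        if i + 1 < len(stored):
            # a_i, r_i and d_i were recorded together with o_{i+1}.
            _, action, reward, discount, _ = stored[i + 1]
        else:
            # Last step: no following record, so these fields are invalid.
            action = reward = discount = None
        loaded.append((observation, action, reward, discount, metadata))
    return loaded


stored_steps = [
    ("o_0", None, None, None, "m_0"),
    ("o_1", "a_0", 0.0, 1.0, "m_1"),
    ("o_2", "a_1", 1.0, 0.0, "m_2"),
]

for step in realign_steps(stored_steps):
    print(step)
```

Observations and metadata keep their index, while action, reward and discount move one step earlier so that each loaded step pairs an observation with the action taken in it.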

Human datasets

If you want to collect data generated by a human interacting with an environment, check the RLDS Creator.

How to load a dataset

RL datasets can be loaded with TFDS and they are retrieved with the canonical RLDS dataset format.

See this section for instructions on how to add an RLDS dataset to TFDS.

Load with TFDS

Datasets in the TFDS catalog

These datasets can be loaded directly with:

tfds.load('dataset_name')['train']

This is how we load the datasets in the tutorial.

See the full documentation and the catalog on the TFDS site.

Datasets in your own repository

Datasets can be implemented with TFDS both inside and outside of the TFDS repository. See examples here.

How to add your dataset to TFDS

Adding a dataset to TFDS involves two steps:

  • Implement a Python class that provides a dataset builder with the specs of the data (e.g., the shape of the observations, actions, etc.) and how to read your dataset files.

  • Run a download_and_prepare pipeline that converts the data to the TFDS intermediate format.

You can add your dataset directly to TFDS following the instructions at https://www.tensorflow.org/datasets.

  • If your data has been generated with EnvLogger or the RLDS Creator, you can just use the rlds helpers in TFDS (see an example here).
  • Otherwise, make sure your generate_examples implementation provides the same structure and keys as RLDS loaders if you want your dataset to be compatible with RLDS pipelines (example).

Note that you can follow the same steps to add the data to your own repository (see more details in the TFDS documentation).
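When writing your own generate_examples, a quick structural check on the episodes it yields can catch missing keys early. The following is a minimal sketch in plain Python, assuming episodes are dictionaries whose steps live under a "steps" key and whose step keys match the fields listed in the Dataset Format section. REQUIRED_STEP_KEYS and missing_step_keys are hypothetical names and are not part of RLDS or TFDS.

```python
from typing import Any, Dict, List

# Step fields listed in the Dataset Format section.
REQUIRED_STEP_KEYS = frozenset({
    "observation", "action", "reward", "discount",
    "is_first", "is_last", "is_terminal",
})


def missing_step_keys(episode: Dict[str, Any]) -> List[str]:
    """Returns the required step keys absent from at least one step."""
    missing = set()
    for step in episode["steps"]:
        missing |= REQUIRED_STEP_KEYS - set(step)
    return sorted(missing)


# A hand-written episode with one incomplete step, for illustration only.
episode = {
    "steps": [
        {"observation": [0.0], "action": 1, "reward": 0.0, "discount": 1.0,
         "is_first": True, "is_last": False, "is_terminal": False},
        {"observation": [1.0], "action": 0, "reward": 1.0,
         "is_first": False, "is_last": True},
    ],
}

print(missing_step_keys(episode))  # ['discount', 'is_terminal']
```

A check like this only covers key names. Shapes and dtypes still need to match the feature specs declared in the dataset builder.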

Performance best practices

Because RLDS exposes RL datasets as TensorFlow tf.data datasets, many of TensorFlow's performance hints apply to RLDS as well. Note, however, that RLDS datasets have a specific structure, so not all general speed-up methods work out of the box, and general advice on improving performance might not produce the expected outcome. To better understand how to use RLDS datasets effectively, we recommend going through this colab.

Citation

If you use RLDS, please cite the RLDS paper as

@misc{ramos2021rlds,
      title={RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning},
      author={Sabela Ramos and Sertan Girgin and Léonard Hussenot and Damien Vincent and Hanna Yakubovich and Daniel Toyama and Anita Gergely and Piotr Stanczyk and Raphael Marinier and Jeremiah Harmsen and Olivier Pietquin and Nikola Momchev},
      year={2021},
      eprint={2111.02767},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Acknowledgements

We greatly appreciate all the support from the TF-Agents team in setting up building and testing for EnvLogger.

Disclaimer

This is not an officially supported Google product.

Owner
Google Research