RLDS

RLDS stands for Reinforcement Learning Datasets. It is an ecosystem of tools to store, retrieve and manipulate episodic data in the context of Sequential Decision Making, including Reinforcement Learning (RL), Learning from Demonstrations, Offline RL and Imitation Learning.

This repository includes a library for manipulating RLDS-compliant datasets. For other parts of the pipeline, please refer to:

  • EnvLogger to create synthetic datasets.
  • RLDS Creator to create datasets where a human interacts with an environment.
  • TFDS for existing RL datasets.

QuickStart & Colabs

See how to use RLDS in this tutorial.

You can find more examples in the following colabs:

Dataset Format

The dataset is retrieved as a tf.data.Dataset of Episodes where each episode contains a tf.data.Dataset of steps.


  • Episode: dictionary that contains a tf.data.Dataset of Steps, and metadata.

  • Step: dictionary that contains:

    • observation: current observation
    • action: action taken in the current observation
    • reward: reward obtained after applying the action to the current observation
    • is_terminal: whether this is a terminal step
    • is_first: whether this is the first step of an episode, which contains the initial state.
    • is_last: whether this is the last step of an episode, which contains the last observation. When true, action, reward, discount and any other custom fields subsequent to the observation are considered invalid.
    • discount: discount factor at this step.
    • extra metadata

    When is_terminal = True, the observation corresponds to a final state, so reward, discount and action are meaningless. Depending on the environment, the final observation may also be meaningless.

    If an episode ends in a step where is_terminal = False, it means that this episode has been truncated. In this case, depending on the environment, the action, reward and discount might be empty as well.
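
    The two rules above can be condensed into a check on the final step of an episode. The following is a minimal sketch in plain Python, assuming steps are represented as dictionaries with the keys listed above. The helper name `episode_ending` and the toy values are hypothetical and not part of RLDS.

```python
from typing import Any, Dict, List

Step = Dict[str, Any]


def episode_ending(steps: List[Step]) -> str:
    """Classifies how an episode ended, based on its final step."""
    if not steps:
        raise ValueError("an episode must contain at least one step")
    # A terminal final step means a final state was reached; otherwise the
    # episode was cut short (truncated).
    return "terminated" if steps[-1]["is_terminal"] else "truncated"


# Toy two-step episode (values are made up for illustration).
steps = [
    {"observation": 0, "action": 1, "reward": 0.5, "discount": 1.0,
     "is_first": True, "is_last": False, "is_terminal": False},
    {"observation": 1, "action": 0, "reward": 0.0, "discount": 0.0,
     "is_first": False, "is_last": True, "is_terminal": True},
]
print(episode_ending(steps))  # prints "terminated"
```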

How to create a dataset

Although you can read datasets with the RLDS format even if they were not created with our tools (for example, by adding them to TFDS), we recommend the use of EnvLogger and RLDS Creator as they ensure that the data is stored in a lossless fashion and compatible with RLDS.

Synthetic datasets

EnvLogger provides a dm_env Environment class wrapper that records interactions between a real environment and an agent.

import envlogger

# `environment` is an existing dm_env.Environment instance.
env = envlogger.EnvLogger(
      environment,
      data_directory='/tmp/mydataset')

In addition, two callbacks can be passed to the EnvLogger constructor to store per-step metadata and per-episode metadata. See the EnvLogger documentation for more details.

Note that per-session metadata can be stored but is currently ignored when loading the dataset.

Note that EnvLogger follows the dm_env convention. Using the following notation:

  • o_i: observation at step i
  • a_i: action applied to o_i
  • r_i: reward obtained when applying a_i in o_i
  • d_i: discount for reward r_i
  • m_i: metadata for step i

Data is generated and stored as:

    (o_0, _, _, _, m_0) → (o_1, a_0, r_0, d_0, m_1) → (o_2, a_1, r_1, d_1, m_2) ⇢ ...

But loaded with RLDS as:

    (o_0, a_0, r_0, d_0, m_0) → (o_1, a_1, r_1, d_1, m_1) → (o_2, a_2, r_2, d_2, m_2) ⇢ ...
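
The re-alignment amounts to pairing each observation with the action, reward and discount that were recorded one step later. Below is a minimal sketch in plain Python, assuming stored records are tuples `(o, a, r, d, m)` with `None` in the empty slots. `realign` is a hypothetical helper for illustration, not the RLDS loader itself, and filling the final step with `None` is an assumption of this sketch.

```python
from typing import Any, List, Optional, Tuple

Record = Tuple[Any, Optional[Any], Optional[float], Optional[float], Any]


def realign(stored: List[Record]) -> List[Record]:
    """Shifts action, reward and discount back by one step.

    stored[i] holds (o_i, a_{i-1}, r_{i-1}, d_{i-1}, m_i); the result holds
    (o_i, a_i, r_i, d_i, m_i). The final observation has no following
    record, so its action, reward and discount are left as None.
    """
    aligned = []
    for i, (obs, _, _, _, meta) in enumerate(stored):
        if i + 1 < len(stored):
            _, action, reward, discount, _ = stored[i + 1]
        else:
            action, reward, discount = None, None, None
        aligned.append((obs, action, reward, discount, meta))
    return aligned


stored = [
    ("o_0", None, None, None, "m_0"),
    ("o_1", "a_0", 1.0, 0.9, "m_1"),
    ("o_2", "a_1", 2.0, 0.9, "m_2"),
]
for record in realign(stored):
    print(record)
```

Here the first aligned record carries `a_0` together with `o_0`, while the last one keeps only its observation and metadata.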

Human datasets

If you want to collect data generated by a human interacting with an environment, check the RLDS Creator.

How to load a dataset

RL datasets can be loaded with TFDS, which returns them in the canonical RLDS dataset format.

See this section for instructions on how to add an RLDS dataset to TFDS.

Load with TFDS

Datasets in the TFDS catalog

These datasets can be loaded directly with:

import tensorflow_datasets as tfds
tfds.load('dataset_name')['train']

This is how we load the datasets in the tutorial.

See the full documentation and the catalog on the TFDS site.
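
Once loaded, the dataset is consumed as a nested iteration: episodes on the outside, steps on the inside. The sketch below shows that access pattern using plain Python lists and dictionaries as stand-ins for the `tf.data.Dataset` objects, an assumption made so the example runs without TensorFlow. The helpers `episode_return` and `all_returns` are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def episode_return(episode: Dict[str, Any]) -> float:
    """Sums the rewards of an episode, skipping the last step.

    The reward of a step with is_last=True is treated as invalid,
    following the step description in the Dataset Format section.
    """
    total = 0.0
    for step in episode["steps"]:
        if not step["is_last"]:
            total += float(step["reward"])
    return total


def all_returns(episodes: Iterable[Dict[str, Any]]) -> List[float]:
    """Computes the undiscounted return of every episode."""
    return [episode_return(episode) for episode in episodes]


# Stand-in data: two short episodes with made-up rewards.
episodes = [
    {"steps": [
        {"reward": 1.0, "is_last": False},
        {"reward": 2.0, "is_last": False},
        {"reward": 0.0, "is_last": True},
    ]},
    {"steps": [
        {"reward": 0.5, "is_last": False},
        {"reward": 0.0, "is_last": True},
    ]},
]
print(all_returns(episodes))  # prints [3.0, 0.5]
```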

Datasets in your own repository

Datasets can be implemented with TFDS both inside and outside of the TFDS repository. See examples here.

How to add your dataset to TFDS

Adding a dataset to TFDS involves two steps:

  • Implement a Python class that provides a dataset builder with the specs of the data (e.g., the shape of the observations, actions, etc.) and how to read your dataset files.

  • Run a download_and_prepare pipeline that converts the data to the TFDS intermediate format.

You can add your dataset directly to TFDS following the instructions at https://www.tensorflow.org/datasets.

  • If your data has been generated with EnvLogger or the RLDS Creator, you can just use the rlds helpers in TFDS (see an example here).
  • Otherwise, make sure your generate_examples implementation provides the same structure and keys as RLDS loaders if you want your dataset to be compatible with RLDS pipelines (example).
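
As a rough illustration of the structure such an implementation needs to yield, here is a sketch in plain Python. It is not a TFDS builder: the `trajectories` input, the `episode_id` key and the default discount of 1.0 are hypothetical choices for this example, and the step keys are taken from the Dataset Format section.

```python
from typing import Any, Dict, Iterator, List, Tuple


def generate_examples(
    trajectories: List[List[Dict[str, Any]]],
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yields (key, episode) pairs whose steps carry the RLDS step keys."""
    for index, trajectory in enumerate(trajectories):
        steps = []
        for t, raw in enumerate(trajectory):
            steps.append({
                "observation": raw["observation"],
                "action": raw["action"],
                "reward": raw["reward"],
                # Assumption: a missing discount defaults to 1.0.
                "discount": raw.get("discount", 1.0),
                "is_first": t == 0,
                "is_last": t == len(trajectory) - 1,
                "is_terminal": raw.get("is_terminal", False),
            })
        # One episode per trajectory: the steps plus episode-level metadata.
        yield f"episode_{index}", {"steps": steps, "episode_id": index}


# Made-up raw data: a single trajectory with two steps.
raw_data = [[
    {"observation": [0.0], "action": 1, "reward": 0.0},
    {"observation": [1.0], "action": 0, "reward": 1.0, "is_terminal": True},
]]
key, episode = next(generate_examples(raw_data))
print(key, sorted(episode["steps"][0]))
```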

Note that you can follow the same steps to add the data to your own repository (see more details in the TFDS documentation).

Performance best practices

Because RLDS exposes RL datasets as TensorFlow tf.data datasets, many of TensorFlow's performance hints apply to RLDS as well. Note, however, that RLDS datasets are very specific and not all general speed-up methods work out of the box, so generic advice on improving performance might not produce the expected outcome. To get a better understanding of how to use RLDS datasets effectively, we recommend going through this colab.

Citation

If you use RLDS, please cite the RLDS paper as

@misc{ramos2021rlds,
      title={RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning},
      author={Sabela Ramos and Sertan Girgin and Léonard Hussenot and Damien Vincent and Hanna Yakubovich and Daniel Toyama and Anita Gergely and Piotr Stanczyk and Raphael Marinier and Jeremiah Harmsen and Olivier Pietquin and Nikola Momchev},
      year={2021},
      eprint={2111.02767},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Acknowledgements

We greatly appreciate all the support from the TF-Agents team in setting up building and testing for EnvLogger.

Disclaimer

This is not an officially supported Google product.

Owner
Google Research