DrQ-v2: Improved Data-Augmented Reinforcement Learning

Last update: Jan 01, 2023

Overview

DrQ-v2: Improved Data-Augmented RL Agent

Method

DrQ-v2 is a model-free off-policy algorithm for image-based continuous control. DrQ-v2 builds on DrQ, an actor-critic approach that uses data augmentation to learn directly from pixels. We introduce several improvements including:

Switch the base RL learner from SAC to DDPG.
Incorporate n-step returns to estimate TD error.
Introduce a decaying schedule for exploration noise.
Make the implementation 3.5 times faster.
Find better hyper-parameters.

These changes allow us to significantly improve sample efficiency and wall-clock training time on a set of challenging tasks from the DeepMind Control Suite compared to prior methods. Furthermore, DrQ-v2 is able to solve complex humanoid locomotion tasks directly from pixel observations, a result previously unattained by model-free RL.
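Two of the improvements listed above, n-step returns and a decaying exploration-noise schedule, can be illustrated with a short sketch. This is not the repository's code: the function names `n_step_target` and `linear_noise_schedule`, the linear shape of the decay, and all numeric values are assumptions chosen for illustration only.

```python
from typing import Sequence


def n_step_target(rewards: Sequence[float],
                  bootstrap_value: float,
                  discount: float) -> float:
    """Hypothetical helper: n-step TD target.

    Computes r_0 + g*r_1 + ... + g^(n-1)*r_(n-1) + g^n * bootstrap_value,
    where g is the discount and bootstrap_value is the critic's estimate
    at the state reached after n steps.
    """
    target = 0.0
    scale = 1.0
    for reward in rewards:
        target += scale * reward
        scale *= discount
    # After the loop, scale == discount ** len(rewards).
    return target + scale * bootstrap_value


def linear_noise_schedule(step: int,
                          initial: float,
                          final: float,
                          duration: int) -> float:
    """Hypothetical helper: exploration-noise scale at a training step.

    Decays linearly from `initial` to `final` over `duration` steps and
    stays at `final` afterwards. The linear shape is an assumption.
    """
    fraction = min(max(step / duration, 0.0), 1.0)
    return (1.0 - fraction) * initial + fraction * final


if __name__ == "__main__":
    # Three rewards of 1.0, a bootstrap value of 10.0, discount 0.5.
    print(n_step_target([1.0, 1.0, 1.0], 10.0, 0.5))
    # Noise scale at the start, midpoint, and past the end of the decay.
    for step in (0, 50, 200):
        print(step, linear_noise_schedule(step, 1.0, 0.1, 100))
```

Using several real rewards before bootstrapping from the critic tends to propagate reward information faster than a one-step target, and shrinking the noise over time shifts the agent from exploring early on to acting closer to its learned policy later.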

Citation

If you use this repo in your research, please consider citing the paper as follows:

@article{yarats2021drqv2,
  title={Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning},
  author={Denis Yarats and Rob Fergus and Alessandro Lazaric and Lerrel Pinto},
  journal={arXiv preprint arXiv:},
  year={2021}
}

Instructions

Install dependencies:

conda env create -f conda_env.yml
conda activate drqv2

Train the agent:

python train.py task=quadruped_walk

Monitor results:

tensorboard --logdir exp_local

License

The majority of DrQ-v2 is licensed under the MIT license; however, portions of the project are available under separate license terms: DeepMind is licensed under the Apache 2.0 license.


Owner

Facebook Research
