Source code for Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Last update: Sep 16, 2022

Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Official implementation of ACC, described in the paper "Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning". The source code is based on the PyTorch implementation of TQC, which is in turn based on the implementation of TD3. We thank the authors for making their source code publicly available.

Requirements

Install MuJoCo

Download and install MuJoCo 1.50 from the MuJoCo website. We assume that the MuJoCo files are extracted to the default location (~/.mujoco/mjpro150).
Copy your MuJoCo license key (mjkey.txt) to ~/.mujoco/mjkey.txt.
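
As a quick sanity check before installing the Python dependencies, the two locations named above can be verified with a small script. This is only a sketch and not part of the repository: the function name `missing_mujoco_paths` and its `home` parameter are hypothetical, and the script assumes the default locations (~/.mujoco/mjpro150 and ~/.mujoco/mjkey.txt) are the ones in use.

```python
from pathlib import Path


def missing_mujoco_paths(home=None):
    """Return the expected MuJoCo paths that do not exist under `home`.

    `home` defaults to the current user's home directory. The helper name
    and signature are hypothetical; only the two paths come from the README.
    """
    base = Path(home) if home is not None else Path.home()
    expected = [
        base / ".mujoco" / "mjpro150",   # extracted MuJoCo 1.50 files
        base / ".mujoco" / "mjkey.txt",  # license key
    ]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    missing = missing_mujoco_paths()
    if missing:
        print("Missing MuJoCo paths:")
        for path in missing:
            print("  " + path)
    else:
        print("MuJoCo files found in the default location.")
```

If the script lists a missing path, extract the MuJoCo files or copy the license key to that path before continuing.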

Install

We recommend using an Anaconda environment. In our experiments we used Python 3.7 and the following dependencies:

pip install gym==0.17.2 mujoco-py==1.50.1.68 numpy==1.19.1 torch==1.6.0 torchvision==0.7.0

Running ACC

You can run ACC for TQC on one of the Gym continuous control environments by calling:

python main.py --env "HalfCheetah-v3" --max_timesteps 5000000 --seed 0

To run the data-efficient variant with 4 critic update steps per environment step, you can call:

python main.py --env "HalfCheetah-v3" --max_timesteps 1000000 --num_critic_updates 4 --seed 0

Example scripts that run the experiments for 10 seeds and all environments are provided in run_experiment.sh and run_experiment_data_efficient.sh.
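
To run only a subset of seeds or environments, the command lines can also be generated programmatically. The sketch below is not taken from run_experiment.sh, whose contents are not shown here. The helper `build_commands` is hypothetical, and numbering the seeds 0 to 9 is an assumption; only the flags --env, --max_timesteps, --num_critic_updates and --seed come from the commands above.

```python
def build_commands(envs, seeds, max_timesteps, num_critic_updates=None):
    """Build one `python main.py ...` argument list per (env, seed) pair.

    Hypothetical helper: it only assembles the arguments and does not
    start any training run.
    """
    commands = []
    for env in envs:
        for seed in seeds:
            cmd = ["python", "main.py", "--env", env,
                   "--max_timesteps", str(max_timesteps)]
            # Only the data-efficient variant passes this flag.
            if num_critic_updates is not None:
                cmd += ["--num_critic_updates", str(num_critic_updates)]
            cmd += ["--seed", str(seed)]
            commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Assumption: seeds 0..9; add further environment names as needed.
    for cmd in build_commands(["HalfCheetah-v3"], range(10), 5000000):
        print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run` to launch the runs one after another, or printed and pasted into a job scheduler. For the data-efficient variant, call `build_commands(["HalfCheetah-v3"], range(10), 1000000, num_critic_updates=4)`.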

You can speed up the experiments by using fewer networks in the TQC ensemble. This trades a small amount of performance for a faster runtime (see the Appendix of the paper). The number of networks is controlled with the flag --n_nets. For example:

python main.py --env "HalfCheetah-v3" --max_timesteps 5000000 --n_nets 2 --seed 0
