Source code for Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Last update: Sep 16, 2022

Related tags

Overview

Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Official implementation of ACC, described in the paper "Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning". The source code is based on the pytorch implementation of TQC, which again is based on TD3. We thank the authors for making their source code publicly available.

Requirements

Install MuJoCo

Download and install MuJoCo 1.50 from the MuJoCo website. We assume that the MuJoCo files are extracted to the default location (~/.mujoco/mjpro150).
Copy your MuJoCo license key (mjkey.txt) to ~/.mujoco/mjkey.txt:

Install

We recommend to use an anaconda environment. In our experiments we used python 3.7 and the following dependencies

pip install gym==0.17.2 mujoco-py==1.50.1.68 numpy==1.19.1 torch==1.6.0 torchvision==0.7.0

Running ACC

You can run ACC for TQC on one of the gym continuous control environments by calling

python main.py --env "HalfCheetah-v3" --max_timesteps 5000000 --seed 0

To run the data efficient variant with 4 critic update steps per environment step you can call

python main.py --env "HalfCheetah-v3" --max_timesteps 1000000 --num_critic_updates 4 --seed 0

An example script that runs the experiments for 10 seeds and all environments is in run_experiment.sh and run_experiment_data_efficient.sh.

You can speed up the experiments by using fewer networks in the ensemble of TQC. This trades off a little bit of performance for a faster runtime (see the Appendix of the paper). The number of networks can be controlled with the flag --n_nets. For example

python main.py --env "HalfCheetah-v3" --max_timesteps 5000000 --n_nets 2--seed 0

Source code for Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Related tags

Overview

Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Requirements

Install MuJoCo

Install

Running ACC

Owner

Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks

Object detection using yolo-tiny model and opencv used as backend

How to use TensorLayer

Multi-task yolov5 with detection and segmentation based on yolov5

Unofficial implementation of the Involution operation from CVPR 2021

SGPT: Multi-billion parameter models for semantic search

URIE: Universal Image Enhancementfor Visual Recognition in the Wild

NCNN implementation of Real-ESRGAN. Real-ESRGAN aims at developing Practical Algorithms for General Image Restoration.

DeepFaceLab fork which provides IPython Notebook to use DFL with Google Colab

Replication Package for AequeVox:Automated Fariness Testing for Speech Recognition Systems

Official Pytorch implementation of "DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network" (CVPR'21)

Official implementation of NeurIPS 2021 paper "One Loss for All: Deep Hashing with a Single Cosine Similarity based Learning Objective"

Code for DeepCurrents: Learning Implicit Representations of Shapes with Boundaries

🗺 General purpose U-Network implemented in Keras for image segmentation

Image-to-image regression with uncertainty quantification in PyTorch

Official repository for the paper "GN-Transformer: Fusing AST and Source Code information in Graph Networks".

U-2-Net: U Square Net - Modified for paired image training of style transfer

a morph transfer UGATIT for image translation.

MLOps will help you to understand how to build a Continuous Integration and Continuous Delivery pipeline for an ML/AI project.

Automatic Idiomatic Expression Detection