PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

Last update: Dec 08, 2022

Overview

IDAAC: Invariant Decoupled Advantage Actor-Critic

This is a PyTorch implementation of the methods proposed in Decoupling Value and Policy for Generalization in Reinforcement Learning by Roberta Raileanu and Rob Fergus.

Citation

If you use this code in your own work, please cite our paper:

@article{Raileanu2021DecouplingVA,
  title={Decoupling Value and Policy for Generalization in Reinforcement Learning},
  author={Roberta Raileanu and R. Fergus},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.10330}
}

Requirements

To install all the required dependencies:

conda create -n idaac python=3.7
conda activate idaac

cd idaac
pip install -r requirements.txt

pip install procgen

git clone https://github.com/openai/baselines.git
cd baselines 
python setup.py install
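
After installation, a quick import check can confirm that the environment is usable. The snippet below is a minimal sketch, assuming the installed packages are importable as `torch`, `procgen`, and `baselines`; the helper name `check_dependencies` is hypothetical and not part of this repo.

```python
import importlib.util


def check_dependencies(modules=("torch", "procgen", "baselines")):
    """Map each top-level module name to True if it can be located, else False.

    The default module names are an assumption about how the packages
    installed above are imported; adjust them if your setup differs.
    """
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Print one status line per dependency without importing it.
    for name, found in check_dependencies().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

The check only locates each module and does not import it, so it runs quickly and will not fail on a broken GPU setup.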

Instructions

This repo provides instructions for training IDAAC, DAAC, and PPO on the Procgen benchmark.

Train IDAAC on CoinRun

python train.py --env_name coinrun --algo idaac

Train DAAC on CoinRun

python train.py --env_name coinrun --algo daac

Train PPO on CoinRun

python train.py --env_name coinrun --algo ppo --ppo_epoch 3

Note: By default, the code uses a single set of hyperparameters (HPs) for all environments, chosen because it performs best overall. In our studies, we found that some games benefit further from slightly different HPs, so we provide those as well. To use the best hyperparameters for each environment, pass the flag --use_best_hps.
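
To compare the three algorithms on one or more games, the commands above can be generated from a small launcher script. This is a sketch, not part of the repo: it uses only the flags shown in this README (`--env_name`, `--algo`, `--ppo_epoch`, `--use_best_hps`), the helpers `build_command` and `run_all` and the `dry_run` switch are hypothetical, and it assumes it is run from the repo root where `train.py` lives.

```python
import subprocess
import sys

# Algorithms named in this README.
ALGOS = ("idaac", "daac", "ppo")


def build_command(env_name, algo, use_best_hps=False):
    """Build the train.py command line for one (environment, algorithm) pair."""
    cmd = [sys.executable, "train.py", "--env_name", env_name, "--algo", algo]
    if algo == "ppo":
        # Mirrors the PPO example above, which passes --ppo_epoch 3.
        cmd += ["--ppo_epoch", "3"]
    if use_best_hps:
        # Assumption: --use_best_hps is a bare flag that takes no value.
        cmd.append("--use_best_hps")
    return cmd


def run_all(env_names, use_best_hps=False, dry_run=True):
    """Print, and optionally run one after another, a command per env and algorithm."""
    commands = [build_command(env, algo, use_best_hps)
                for env in env_names for algo in ALGOS]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing run
    return commands


if __name__ == "__main__":
    run_all(["coinrun"], dry_run=True)
```

With `dry_run=True` the script only prints the commands, so they can be checked before any training starts. Whether `--use_best_hps` combines sensibly with an explicit `--ppo_epoch` has not been verified here.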

Overview of DAAC and IDAAC

Procgen Results

IDAAC achieves state-of-the-art performance on the Procgen benchmark (easy mode), significantly improving the agent's generalization ability over standard RL methods such as PPO.

Test Results on Procgen

Acknowledgements

This code is based on an open-source PyTorch implementation of PPO.
