Learning Off-Policy with Online Planning, CoRL 2021

Last update: Nov 22, 2022


Overview

LOOP: Learning Off-Policy with Online Planning

Accepted at the Conference on Robot Learning (CoRL) 2021.

Harshit Sikchi, Wenxuan Zhou, David Held

Paper

Install

PyTorch 1.5
OpenAI Gym
MuJoCo
tqdm
D4RL dataset
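
Before running anything, it can help to confirm that the packages above are importable. The snippet below is a minimal sketch; the import names (`torch`, `gym`, `mujoco_py`, `d4rl`) are assumptions, since this README only lists the packages by their display names.

```python
import importlib.util

# Assumed import names for the dependencies listed above (not given in the README).
ASSUMED_MODULES = {
    "PyTorch": "torch",
    "OpenAI Gym": "gym",
    "MuJoCo": "mujoco_py",
    "tqdm": "tqdm",
    "D4RL dataset": "d4rl",
}

def missing_dependencies(modules=None):
    """Return the labels whose module cannot be found on the import path."""
    modules = ASSUMED_MODULES if modules is None else modules
    return [label for label, name in modules.items()
            if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_dependencies()
    print("Missing:", ", ".join(missing) if missing else "none")
```

This only checks that a module can be located; it does not check versions, so PyTorch 1.5 compatibility still has to be verified separately.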

File Structure

LOOP (Core method)
- Training code (Online RL): train_loop_sac.py
- Training code (Offline RL): train_loop_offline.py
- Training code (safe RL): train_loop_safety.py
- Policies (online/offline/safety): policies.py
- ARC/H-step lookahead policy: controllers/
Environments: envs/
Configurations: configs/
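
To give a feel for what an H-step lookahead policy does, here is a toy sketch. It assumes the common formulation in which each candidate action sequence is rolled out through a dynamics model for H steps and scored by its discounted rewards plus a terminal value estimate. It uses plain random shooting as a stand-in and is not the ARC controller in controllers/; every function and name here is hypothetical.

```python
import random

def sequence_return(state, actions, dynamics, reward, value, gamma=0.99):
    """Discounted model return of one action sequence plus a terminal value."""
    total = 0.0
    for t, action in enumerate(actions):
        total += (gamma ** t) * reward(state, action)
        state = dynamics(state, action)
    return total + (gamma ** len(actions)) * value(state)

def h_step_lookahead(state, candidates, dynamics, reward, value, gamma=0.99):
    """Return the first action of the highest-scoring candidate sequence."""
    best = max(
        candidates,
        key=lambda seq: sequence_return(state, seq, dynamics, reward, value, gamma),
    )
    return best[0]

# Toy 1-D problem (hypothetical): drive the state toward zero.
toy_dynamics = lambda s, a: s + a
toy_reward = lambda s, a: -(s ** 2)
toy_value = lambda s: -(s ** 2)

# Random shooting: sample 64 sequences of length H = 3 and keep the best one.
rng = random.Random(0)
candidates = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(64)]
first_action = h_step_lookahead(1.0, candidates, toy_dynamics, toy_reward, toy_value)
```

Only the first action is executed; replanning from the next state at every step is what makes the planning online.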

Instructions

Run all experiments from the repository root.
Config files in configs/ specify the hyperparameters for the controllers and dynamics.
To reproduce the results in our paper, keep all other values in the yml files consistent with the hyperparameters given in the paper.
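
One way to guard against accidental hyperparameter drift is to compare an edited config against a reference copy. The sketch below works on plain nested dictionaries, such as those returned by `yaml.safe_load` from PyYAML; the keys in the example are hypothetical and are not taken from the files in configs/.

```python
def config_drift(reference, candidate, prefix=""):
    """Return {dotted_key: (reference_value, candidate_value)} for every differing leaf."""
    drift = {}
    for key in sorted(set(reference) | set(candidate)):
        path = f"{prefix}{key}"
        ref, cand = reference.get(key), candidate.get(key)
        if isinstance(ref, dict) and isinstance(cand, dict):
            # Recurse into nested sections and extend the dotted path.
            drift.update(config_drift(ref, cand, prefix=path + "."))
        elif ref != cand:
            drift[path] = (ref, cand)
    return drift

# Hypothetical keys, for illustration only.
reference = {"controller": {"horizon": 3, "num_samples": 100}, "dynamics": {"ensemble_size": 5}}
edited = {"controller": {"horizon": 5, "num_samples": 100}, "dynamics": {"ensemble_size": 5}}
changed = config_drift(reference, edited)
```

A key that is missing on one side is reported with `None` in that position.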

Experiments

Sec 6.1 LOOP for Online RL

python train_loop_sac.py --env=<env_name> --policy=LOOP_SAC_ARC --start_timesteps=<initial exploration steps> --exp_name=<location_to_logs>

Environment wrappers, together with their termination conditions, can be found under envs/
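
As an illustration of what a wrapper with an explicit termination condition can look like, here is a minimal sketch. It assumes the older Gym convention in which `step` returns `(obs, reward, done, info)`, and it is not the code in envs/; `TerminationWrapper`, `ToyEnv`, and the height threshold are all hypothetical.

```python
class TerminationWrapper:
    """Wrap an env-like object and decide `done` with an explicit termination function."""

    def __init__(self, env, termination_fn):
        self.env = env
        self.termination_fn = termination_fn

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, _, info = self.env.step(action)
        # Ignore the inner env's done flag; use the explicit condition instead.
        return obs, reward, bool(self.termination_fn(obs)), info


class ToyEnv:
    """Hypothetical 1-D environment whose observation is a height."""

    def __init__(self):
        self.height = 1.0

    def reset(self):
        self.height = 1.0
        return self.height

    def step(self, action):
        self.height += action
        return self.height, 1.0, False, {}


env = TerminationWrapper(ToyEnv(), termination_fn=lambda height: height < 0.5)
env.reset()
_, _, done_first, _ = env.step(-0.25)   # height is now 0.75
_, _, done_second, _ = env.step(-0.5)   # height is now 0.25
```

Keeping the condition as a standalone function of the observation means the same check can be applied to states predicted by a learned model, not only to states returned by the simulator.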

Sec 6.2 LOOP for Offline RL

Download CRR trained models from Link into the root folder.

python train_loop_offline.py --env=<env_name> --policy=LOOP_OFFLINE_ARC --exp_name=<location_to_logs> --offline_algo=CRR --prior_type=CRR

Currently supports D4RL MuJoCo locomotion tasks only.

Sec 6.3 LOOP for Safe RL

python train_loop_safety.py --env=<env_name> --policy=safeLOOP_ARC --exp_name=<location_to_logs>

Safety environments can be found under envs/safety_envs.py
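
The three training commands above share the same flag pattern, so they are easy to generate from a script when launching several runs. The helper below is a small sketch that uses only the script names and flags shown in this README; `build_command` itself is hypothetical, and the angle-bracket placeholders must be replaced with real values.

```python
import shlex

def build_command(script, env, policy, exp_name, **extra_flags):
    """Build the argv list for one training run in the --flag=value style used above."""
    argv = ["python", script, f"--env={env}", f"--policy={policy}", f"--exp_name={exp_name}"]
    argv += [f"--{name}={value}" for name, value in extra_flags.items()]
    return argv

online = build_command("train_loop_sac.py", "<env_name>", "LOOP_SAC_ARC",
                       "<location_to_logs>", start_timesteps="<initial exploration steps>")
offline = build_command("train_loop_offline.py", "<env_name>", "LOOP_OFFLINE_ARC",
                        "<location_to_logs>", offline_algo="CRR", prior_type="CRR")
safety = build_command("train_loop_safety.py", "<env_name>", "safeLOOP_ARC",
                       "<location_to_logs>")

# shlex.join quotes each argument so the printed line is safe to paste into a shell.
for argv in (online, offline, safety):
    print(shlex.join(argv))
```

Each list can be passed directly to `subprocess.run(argv, check=True)` from the repository root once the placeholders are filled in.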

References

Parts of the code are adapted from the references below:

@article{SpinningUp2018,
    author = {Achiam, Joshua},
    title = {{Spinning Up in Deep Reinforcement Learning}},
    year = {2018}
}

https://github.com/Xingyu-Lin/mbpo_pytorch

Comments

Environment reproducibility

Hi, I am trying to run your code. I am installing the newest versions of the packages and have been encountering errors, for example with mpi4py, which does not install correctly in my environment.

Would it be possible for you to provide a requirements.txt file so that I can create a Python virtual environment with the dependencies needed to run the code? Alternatively, a container image such as Docker would also be great!

opened by pranjaldhole


Releases(v0.0.0)

v0.0.0(Aug 27, 2022)

Owner

Harshit Sikchi
