Code for "Offline Meta-Reinforcement Learning with Advantage Weighting" [ICML 2021]

Last update: Jan 01, 2023

Overview

Offline Meta-Reinforcement Learning with Advantage Weighting (MACAW)

This repository contains the MACAW code used for the experiments in the ICML 2021 paper.

Installing the environment

# Install Python 3.7.9 if necessary
$ pyenv install 3.7.9
$ pyenv shell 3.7.9

$ python --version
Python 3.7.9

$ python -m venv env
$ source env/bin/activate
$ pip install -r requirements.txt
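
As a quick sanity check after the steps above, the following Python sketch reports the interpreter version and whether a virtual environment is active. The function names are illustrative and not part of the MACAW code. The venv test assumes the environment was created with python -m venv, in which case sys.prefix differs from sys.base_prefix.

```python
import sys


def python_matches(major, minor):
    """Return True if the running interpreter is Python <major>.<minor>."""
    return sys.version_info[:2] == (major, minor)


def in_virtualenv():
    """Return True if running inside an environment created by `python -m venv`."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def describe_environment():
    """Summarize the interpreter version and venv status as a string."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return "Python {} (venv active: {})".format(version, in_virtualenv())


print(describe_environment())
```

With the setup above, python_matches(3, 7) should return True once the environment is activated.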

Downloading the data

The offline data used for MACAW can be found here. Download it and store the four data directories in a folder with the default name, macaw_offline_data. gDrive might be useful if downloading from the Google Drive GUI is not an option.
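
To confirm the download landed where the training scripts expect it, a small check like the one below may help. It is only a sketch: the names of the four data directories are not listed here, so the code counts subdirectories rather than matching names, and both function names are hypothetical helpers, not part of the MACAW code.

```python
from pathlib import Path


def list_data_dirs(root="macaw_offline_data"):
    """Return the sorted names of the immediate subdirectories of `root`."""
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError("data folder not found: {}".format(root_path))
    return sorted(p.name for p in root_path.iterdir() if p.is_dir())


def has_expected_layout(root="macaw_offline_data", expected_count=4):
    """Return True if `root` holds exactly `expected_count` subdirectories."""
    return len(list_data_dirs(root)) == expected_count
```

Calling has_expected_layout() from the directory that contains macaw_offline_data returns False if a data directory is missing or an extra one was unpacked alongside the others.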

Running MACAW 🦜

To run offline meta-training with periodic online evaluations, use any of the scripts in scripts/, e.g.:

$ . scripts/macaw_dir.sh # MACAW training on Cheetah-Direction (Figure 1)
$ . scripts/macaw_vel.sh # MACAW training on Cheetah-Velocity (Figure 1)
$ . scripts/macaw_quality_ablation.sh # Data quality ablation (Figure 5-left)
...

Outputs (tensorboard logs) will be written to the log/ directory.
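
TensorBoard logs are usually viewed with tensorboard --logdir log/. Before launching it, a sketch like the following can confirm that a run actually produced event files. It assumes the logs use TensorBoard's standard events.out.tfevents.* file naming and makes no assumption about how log/ is organized into subdirectories; find_event_files is a hypothetical helper, not part of the MACAW code.

```python
from pathlib import Path


def find_event_files(log_dir="log"):
    """Return TensorBoard event files under `log_dir`, sorted by modification time."""
    root = Path(log_dir)
    if not root.is_dir():
        return []
    # TensorBoard writers name their files "events.out.tfevents.<suffix>".
    files = [p for p in root.rglob("events.out.tfevents.*") if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime)
```

An empty list means either that log/ does not exist yet or that no run has written events into it.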

Reach out!

If you're having issues with the code or data, feel free to open an issue or send me an email.

Citation

If our code or research was useful for your own work, you can cite us with the following attribution:

@InProceedings{mitchell2021offline,
    title = {Offline Meta-Reinforcement Learning with Advantage Weighting},
    author = {Mitchell, Eric and Rafailov, Rafael and Peng, Xue Bin and Levine, Sergey and Finn, Chelsea},
    booktitle = {Proceedings of the 38th International Conference on Machine Learning},
    year = {2021}
}

Owner

Eric Mitchell
