Implementation of the ALPHAMEPOL algorithm, presented in "Unsupervised Reinforcement Learning in Multiple Environments".

Last update: Dec 23, 2021


ALPHAMEPOL

This repository contains the implementation of the ALPHAMEPOL algorithm, presented in "Unsupervised Reinforcement Learning in Multiple Environments".

Installation

This codebase requires Python >= 3.6 and a working MuJoCo setup with a valid MuJoCo license. To set up MuJoCo, follow the official MuJoCo installation instructions. To avoid conflicts with your existing Python setup and to keep the project self-contained, we suggest working in a virtual environment created with virtualenv. To install virtualenv:

pip install --upgrade virtualenv

Create a virtual environment, activate it and install the requirements:

virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
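
As a quick sanity check after these steps, the following minimal sketch verifies the interpreter version and whether a virtual environment is active. It is not part of the repository; the helper names are hypothetical, and the virtual environment detection is a best-effort heuristic based on the standard `sys` attributes.

```python
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in this README


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if a (major, minor, ...) version tuple meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def in_virtual_environment():
    """Best-effort check for an active virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix; newer releases and the
    # standard venv module make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("Python >= 3.6:", python_is_supported())
    print("Inside a virtual environment:", in_virtual_environment())
```

Run it with the interpreter from `venv/bin` after activation; both lines should report `True` in a correctly prepared environment.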

Usage

Unsupervised Pre-Training

To reproduce the Unsupervised Pre-Training experiments in the paper, run:

./scripts/exploration/[gridworld_with_slope.sh | multigrid.sh | ant.sh | minigrid.sh]

Supervised Fine-Tuning

To reproduce the Supervised Fine-Tuning experiments, run:

./scripts/goal_rl/[gridworld_with_slope.sh | multigrid.sh | ant.sh | minigrid.sh]

By default, this launches TRPO with ALPHAMEPOL initialization. To launch TRPO with a random initialization instead, remove the policy_init argument from the script.

Note that the scripts for the GridWorld with Slope and MultiGrid experiments set num_goals = 50, so a single run trains on 50 goals sequentially, one goal at a time. To speed this up, you can use several processes (ideally one per goal) by passing num_goals = 1 to each and incrementing the seed from one process to the next. For the Ant and MiniGrid experiments the goals are predefined, so you can instead set the goal_index argument to select a specific goal (from 0 to 7 for Ant and from 0 to 12 for MiniGrid).
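
The per-goal parallelization described above can be sketched as a small command generator. This is only an illustration under assumptions: the entry point `train_goal_rl.py` is hypothetical, and the flag spellings `--num_goals`, `--seed` and `--goal_index` are guesses based on the argument names mentioned in this README. Check the scripts in `scripts/goal_rl` for the actual command and flags before using it.

```python
import shlex

# Inclusive goal index ranges stated in this README: Ant 0-7, MiniGrid 0-12.
GOAL_INDEX_COUNTS = {"ant": 8, "minigrid": 13}


def sampled_goal_commands(base_cmd, n_processes, first_seed=0):
    """Build one command per process, each with a single goal and its own seed."""
    return [
        # Flag names are assumptions, not confirmed against the repository.
        list(base_cmd) + ["--num_goals", "1", "--seed", str(first_seed + i)]
        for i in range(n_processes)
    ]


def indexed_goal_commands(base_cmd, env_name):
    """Build one command per predefined goal index for Ant or MiniGrid."""
    return [
        list(base_cmd) + ["--goal_index", str(idx)]
        for idx in range(GOAL_INDEX_COUNTS[env_name])
    ]


if __name__ == "__main__":
    base = ["python", "train_goal_rl.py"]  # hypothetical entry point
    for cmd in sampled_goal_commands(base, n_processes=3):
        # Print shell-safe command lines instead of executing them.
        print(" ".join(shlex.quote(part) for part in cmd))
```

The functions only build argument lists. Each list can be passed to `subprocess.Popen` to start the runs in parallel once the real command and flags are filled in.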

Results Visualization

Once launched, each experiment will log statistics in the results folder. You can visualize everything by launching tensorboard targeting that directory:

python -m tensorboard.main --logdir=./results --port 8080

and visiting the board at http://localhost:8080.
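
Before starting TensorBoard, it can help to confirm that the experiments actually wrote logs. The sketch below lists the directories under the results folder that contain TensorBoard event files. It assumes the logs use TensorBoard's standard file naming (names containing `tfevents`); the helper name is hypothetical and the layout of the results folder is not specified in this README.

```python
import os


def find_run_dirs(results_dir="./results"):
    """Return sorted directories under results_dir that hold event files."""
    run_dirs = set()
    for dirpath, _dirnames, filenames in os.walk(results_dir):
        # TensorBoard event files are conventionally named events.out.tfevents.*
        if any("tfevents" in name for name in filenames):
            run_dirs.add(dirpath)
    return sorted(run_dirs)


if __name__ == "__main__":
    for run_dir in find_run_dirs():
        print(run_dir)
```

An empty result means no event files were found, so the board would show no runs.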
