Transfer Reinforcement Learning for Differing Action Spaces via Q-Network Representations

Last update: Oct 17, 2022

Overview

Transfer-Learning-in-Reinforcement-Learning

Final Report

Transfer Reinforcement Learning for Differing Action Spaces via Q-Network Representations

Cite this work

Nathan Beck, Abhiramon Rajasekharan, Hieu Tran, "Transfer Reinforcement Learning for Differing Action Spaces via Q-Network Representations", 2021

Project description

Transfer learning approaches in reinforcement learning aim to help agents learn their target domains by leveraging knowledge from other agents that have been trained on similar source domains. Recent research in this area has focused on knowledge transfer between tasks with different transition dynamics and reward functions; little attention, however, has been paid to knowledge transfer between tasks with different action spaces.

In this paper, we approach the task of transfer learning between domains that differ in action spaces. We present a reward shaping method based on source embedding similarity that is applicable to domains with both discrete and continuous action spaces. The efficacy of our approach is evaluated on transfer to restricted action spaces in the Acrobot-v1 and Pendulum-v0 domains (Brockman et al. 2016).
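
As a rough illustration of similarity-based reward shaping, the sketch below adds a bonus to the environment reward that grows with the cosine similarity between two embedding vectors. It is not the method from the report: the function names `cosine_similarity` and `shaped_reward`, the weight `beta`, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def shaped_reward(env_reward: float,
                  source_embedding: Sequence[float],
                  target_embedding: Sequence[float],
                  beta: float = 0.1) -> float:
    """Hypothetical shaping rule: environment reward plus a similarity bonus."""
    bonus = beta * cosine_similarity(source_embedding, target_embedding)
    return env_reward + bonus


if __name__ == "__main__":
    # Identical embeddings earn the full bonus; orthogonal ones earn none.
    print(shaped_reward(-1.0, [1.0, 0.0], [1.0, 0.0]))
    print(shaped_reward(-1.0, [1.0, 0.0], [0.0, 1.0]))
```

Because the bonus depends only on the embeddings, a rule of this shape can be applied whether the underlying actions are discrete or continuous. How the embeddings are obtained from the source Q-network is described in the final report and is not reproduced here.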

Our presentations

Presentation 1 here
Google Doc Folder here

Our Google Colab

https://colab.research.google.com/drive/1cQCV9Ko-prpB8sH6FlB4oj781On-ut_w?usp=sharing

Setup

Clone our repository
Install Gym

Using pip:

pip install gym

Or build from source:

git clone https://github.com/openai/gym
cd gym
pip install -e .

How to run?

Run with a Python IDE

Open main.py or main_multiple_run.py.
Set env_name and algorithm to the environment and algorithm you want to run.
Adjust the parameters of the transfer_execute function if needed.
Logs are printed to the terminal, and the resulting plots open in new windows.

Run with Google Colab

Follow the sample notebook Reward_Shaping_TL.ipynb to set up your own Colab run.

Implemented Algorithms in Stable-Baselines3

Name	Recurrent	`Box`	`Discrete`	`MultiDiscrete`	`MultiBinary`	Multi Processing
A2C	❌	✔️	✔️	✔️	✔️	✔️
DDPG	❌	✔️	❌	❌	❌	❌
DQN	❌	❌	✔️	❌	❌	❌
HER	❌	✔️	✔️	❌	❌	❌
PPO	❌	✔️	✔️	✔️	✔️	✔️
SAC	❌	✔️	❌	❌	❌	❌
TD3	❌	✔️	❌	❌	❌	❌
QR-DQN¹	❌	❌	✔️	❌	❌	❌
TQC¹	❌	✔️	❌	❌	❌	❌
Maskable PPO¹	❌	❌	✔️	✔️	✔️	✔️

¹: Implemented in the SB3 Contrib GitHub repository.
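
When choosing an algorithm for a given environment, the table can be read as a lookup from action-space type to compatible algorithms. The sketch below transcribes the action-space columns of the table into a dictionary; the names `SUPPORTED_SPACES` and `compatible_algorithms` are illustrative and are not part of Stable-Baselines3 or of this repository.

```python
# Supported action spaces per algorithm, transcribed from the table above.
SUPPORTED_SPACES = {
    "A2C": {"Box", "Discrete", "MultiDiscrete", "MultiBinary"},
    "DDPG": {"Box"},
    "DQN": {"Discrete"},
    "HER": {"Box", "Discrete"},
    "PPO": {"Box", "Discrete", "MultiDiscrete", "MultiBinary"},
    "SAC": {"Box"},
    "TD3": {"Box"},
    "QR-DQN": {"Discrete"},
    "TQC": {"Box"},
    "Maskable PPO": {"Discrete", "MultiDiscrete", "MultiBinary"},
}


def compatible_algorithms(space_name: str) -> list:
    """Return the algorithms, in table order, that support the given space."""
    return [name for name, spaces in SUPPORTED_SPACES.items()
            if space_name in spaces]


if __name__ == "__main__":
    print(compatible_algorithms("Discrete"))
    print(compatible_algorithms("Box"))
```

For example, Acrobot-v1 has a discrete action space and Pendulum-v0 a continuous (Box) one, so the two lookups printed above list the candidate algorithms for each domain.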

Action spaces (gym.spaces):

Box: An N-dimensional box that contains every point in the action space.
Discrete: A list of possible actions, of which exactly one can be used at each timestep.
MultiDiscrete: Several sets of discrete actions, where one action from each set can be used at each timestep.
MultiBinary: A list of possible actions, where any combination of the actions can be used at each timestep.
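
To make the four definitions concrete, the sketch below draws one action of each kind using only the Python standard library. It deliberately does not use the `gym.spaces` classes; the helper names and the example sizes are made up for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so repeated runs draw the same actions


def sample_box(low, high):
    """One real value per dimension, each within [low[i], high[i]]."""
    return [rng.uniform(lo, hi) for lo, hi in zip(low, high)]


def sample_discrete(n):
    """A single integer action in {0, ..., n - 1}."""
    return rng.randrange(n)


def sample_multi_discrete(nvec):
    """One integer per discrete set; entry i lies in {0, ..., nvec[i] - 1}."""
    return [rng.randrange(n) for n in nvec]


def sample_multi_binary(n):
    """n independent on/off flags, so any combination can be active."""
    return [rng.randint(0, 1) for _ in range(n)]


if __name__ == "__main__":
    print("Box:          ", sample_box([-2.0], [2.0]))
    print("Discrete:     ", sample_discrete(3))
    print("MultiDiscrete:", sample_multi_discrete([3, 2]))
    print("MultiBinary:  ", sample_multi_binary(4))
```

A Box action is a vector of real numbers, a Discrete action is a single index, a MultiDiscrete action is one index per set, and a MultiBinary action is a vector of 0/1 flags.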

References

OpenAI Gym repo
OpenAI Gym website
Stable Baselines 3 repo
Roboschool repo
Gym extension repos: a Python package that extends OpenAI Gym for auxiliary tasks (multitask learning, transfer learning, inverse reinforcement learning, etc.)
Example code of TL in DL repo
Retro Contest - a transfer learning contest that measures a reinforcement learning algorithm’s ability to generalize from previous experience (hosted by OpenAI) link
Rainbow: Combining Improvements in Deep Reinforcement Learning (repo), (paper)
Experience replay (link)
Solving RL classic control (link)

Contributors

Nathan Beck
Abhiramon Rajasekharan
Trung Hieu Tran
