Learning Domain Invariant Representations in Goal-conditioned Block MDPs

Last update: Apr 12, 2022

Beining Han, Chongyi Zheng, Harris Chan, Keiran Paster, Michael R. Zhang, Jimmy Ba

Summary: Deep Reinforcement Learning agents often face unanticipated environmental changes after deployment in the real world. These changes are often spurious and unrelated to the underlying problem, such as background shifts for visual input agents. Unfortunately, deep RL policies are usually sensitive to these changes and fail to act robustly against them. This resembles the problem of domain generalization in supervised learning. In this work, we study this problem for goal-conditioned RL agents. We propose a theoretical framework in the Block MDP setting that characterizes the generalizability of goal-conditioned policies to new environments. Under this framework, we develop a practical method PA-SkewFit (PASF) that enhances domain generalization.

@article{han2021learning,
  title={Learning Domain Invariant Representations in Goal-conditioned Block MDPs},
  author={Han, Beining and Zheng, Chongyi and Chan, Harris and Paster, Keiran and Zhang, Michael and Ba, Jimmy},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

Installation

Our code was adapted from rlkit and was tested on an Ubuntu 20.04 server.

These instructions assume that you have already installed the NVIDIA driver, Anaconda, and MuJoCo.

You'll need to get your own MuJoCo key if you want to use MuJoCo.

1. Create Anaconda environment

Install the included Anaconda environment

$ conda env create -f environment/pasf_env.yml
$ source activate pasf_env
(pasf_env) $ python
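Once the interpreter starts, a quick sanity check can confirm that the environment resolved its main dependencies. The sketch below is only illustrative: the package list (torch, mujoco_py, rlkit, multiworld) is an assumption based on the tools mentioned in this README, not a list taken from environment/pasf_env.yml, so adjust it as needed.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list; edit it to match environment/pasf_env.yml.
    required = ["torch", "mujoco_py", "rlkit", "multiworld"]
    missing = missing_packages(required)
    print("missing:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, re-check that pasf_env is the active environment before reinstalling.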

2. Download the goals

Download the goals from the following link and put them here: (PASF DIR)/multiworld/envs/mujoco.

https://drive.google.com/drive/folders/1L9SYFADWmFzdP1c6wf2yo2WjOlXJh8Iu?usp=sharing

$ ls (PASF DIR)/multiworld/envs/mujoco
... goals ...
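The same check can be done from Python before launching a long run. This is a minimal sketch that assumes the downloaded entry is named goals, as in the listing above; the helper name goals_installed is hypothetical and not part of the repository.

```python
from pathlib import Path


def goals_installed(pasf_dir):
    """Return True if a 'goals' entry exists under multiworld/envs/mujoco in pasf_dir."""
    goals = Path(pasf_dir) / "multiworld" / "envs" / "mujoco" / "goals"
    return goals.exists()


if __name__ == "__main__":
    # Replace "." with the path to your PASF checkout.
    print("goals found:", goals_installed("."))
```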

3. (Optional) Speed-up with GPU rendering

Note: GPU rendering for mujoco-py speeds up training a lot but consumes more GPU memory at the same time.

Check this issue:

Remember to apply these changes to the mujoco-py package installed inside your pasf_env.

Running Experiments

The following commands run the PASF learning curve (LC) experiments for the four tasks: Reach, Door, Push, and Pickup, respectively.

$ source activate pasf_env
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_reach_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_door_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_push_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_pickup_lc_exp.bash

The bash scripts only set three hyperparameters, using the exact values we used for LC. You can experiment with other hyperparameters in the Python scripts under (PASF DIR)/experiment.
Training and evaluation environments are chosen in the Python scripts for each task. You can find the backgrounds in (PASF DIR)/multiworld/core/background and the domains in (PASF DIR)/multiworld/envs/assets/sawyer_xyz.
Results are recorded in progress.csv under (PASF DIR)/data/ and variant.json contains configuration for each experiment.
We simply set random seeds as 0, 1, 2, etc., and run experiments with 6-9 different seeds for each task.
Error and output logs can be found in (PASF DIR)/terminal_log.
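To compare runs across seeds, the progress.csv files can be gathered with the standard library alone. The sketch below assumes only that each run directory under (PASF DIR)/data/ contains a progress.csv with a header row. The column name passed in is a placeholder, since the logged column names are not listed here, and the helper names are hypothetical.

```python
import csv
from pathlib import Path


def read_progress(csv_path):
    """Read one progress.csv into a list of dicts keyed by the header row."""
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))


def last_value(rows, column):
    """Return the final non-empty value of `column` as a float, or None if there is none."""
    for row in reversed(rows):
        value = row.get(column, "")
        if value not in ("", None):
            return float(value)
    return None


def collect(data_dir, column):
    """Map each run directory that holds a progress.csv to its final value of `column`."""
    results = {}
    for path in sorted(Path(data_dir).rglob("progress.csv")):
        results[str(path.parent)] = last_value(read_progress(path), column)
    return results


if __name__ == "__main__":
    # "some_metric" is a placeholder; use a column that appears in your progress.csv.
    for run_dir, value in collect("data", "some_metric").items():
        print(run_dir, value)
```

Runs that never logged the requested column show up with a value of None, which makes incomplete or crashed seeds easy to spot.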

Questions

If you have any questions, comments, or suggestions, please reach out to Beining Han ([email protected]) and Chongyi Zheng ([email protected]).

Owner

Chongyi Zheng
