Learning Domain Invariant Representations in Goal-conditioned Block MDPs

Related tags

Deep LearningPASF
Overview

Learning Domain Invariant Representations in Goal-conditioned Block MDPs

Beining Han,   Chongyi Zheng,   Harris Chan,   Keiran Paster,   Michael R. Zhang,   Jimmy Ba

paper

Summary: Deep Reinforcement Learning agents often face unanticipated environmental changes after deployment in the real world. These changes are often spurious and unrelated to the underlying problem, such as background shifts for visual input agents. Unfortunately, deep RL policies are usually sensitive to these changes and fail to act robustly against them. This resembles the problem of domain generalization in supervised learning. In this work, we study this problem for goal-conditioned RL agents. We propose a theoretical framework in the Block MDP setting that characterizes the generalizability of goal-conditioned policies to new environments. Under this framework, we develop a practical method PA-SkewFit (PASF) that enhances domain generalization.

@article{han2021learning,
  title={Learning Domain Invariant Representations in Goal-conditioned Block MDPs},
  author={Han, Beining and Zheng, Chongyi and Chan, Harris and Paster, Keiran and Zhang, Michael and Ba, Jimmy},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

Installation

Our code was adapted from rlkit and was tested on a Ubuntu 20.04 server.

This instruction assumes that you have already installed NVIDIA driver, Anaconda, and MuJoCo.

You'll need to get your own MuJoCo key if you want to use MuJoCo.

1. Create Anaconda environment

Install the included Anaconda environment

$ conda env create -f environment/pasf_env.yml
$ source activate pasf_env
(pasf_env) $ python

2. Download the goals

Download the goals from the following link and put it here: (PASF DIR)/multiworld/envs/mujoco.

$ ls (PASF DIR)/multiworld/envs/mujoco
... goals ... 
  1. (Optional) Speed up with GPU rendering

3. (Optional) Speed-up with GPU rendering

Note: GPU rendering for mujoco-py speeds up training a lot but consumes more GPU memory at the same time.

Check this Issues:

Remember to do this stuff with the mujoco-py package inside of your pasf_env.

Running Experiments

The following command run the PASF experiments for the four tasks: Reach, Door, Push, Pickup, in the learning curve respectively.

$ source activate pasf_env
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_reach_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_door_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_push_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_pickup_lc_exp.bash
  • The bash scripts only set equation, equation, and equation with the exact values we used for LC. But you can play with other hyperparameters in python scripts under (PASF DIR)/experiment.

  • Training and evaluation environments are chosen in python scripts for each task. You can find the backgrounds in (PASF DIR)/multiworld/core/background and domains in (PASF DIR)/multiworld/envs/assets/sawyer_xyz.

  • Results are recorded in progress.csv under (PASF DIR)/data/ and variant.json contains configuration for each experiment.

  • We simply set random seeds as 0, 1, 2, etc., and run experiments with 6-9 different seeds for each task.

  • Error and output logs can be found in (PASF DIR)/terminal_log.

Questions

If you have any questions, comments, or suggestions, please reach out to Beining Han ([email protected]) and Chongyi Zheng ([email protected]).

Owner
Chongyi Zheng
Chongyi Zheng
CCPD: a diverse and well-annotated dataset for license plate detection and recognition

CCPD (Chinese City Parking Dataset, ECCV) UPdate on 10/03/2019. CCPD Dataset is now updated. We are confident that images in subsets of CCPD is much m

detectRecog 1.8k Dec 30, 2022
An official implementation of "Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation" (CVPR 2021) in PyTorch.

BANA This is the implementation of the paper "Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation". For more inf

CV Lab @ Yonsei University 59 Dec 12, 2022
Half Instance Normalization Network for Image Restoration

HINet Half Instance Normalization Network for Image Restoration, based on https://github.com/megvii-model/HINet. Dependencies NumPy PyTorch, preferabl

Holy Wu 4 Jun 06, 2022
This is the official repository of Music Playlist Title Generation: A Machine-Translation Approach.

PlyTitle_Generation This is the official repository of Music Playlist Title Generation: A Machine-Translation Approach. The paper has been accepted by

SeungHeonDoh 6 Jan 03, 2022
Crab is a flexible, fast recommender engine for Python that integrates classic information filtering recommendation algorithms in the world of scientific Python packages (numpy, scipy, matplotlib).

Crab - A Recommendation Engine library for Python Crab is a flexible, fast recommender engine for Python that integrates classic information filtering r

python-recsys 1.2k Dec 21, 2022
Joint Learning of 3D Shape Retrieval and Deformation, CVPR 2021

Joint Learning of 3D Shape Retrieval and Deformation Joint Learning of 3D Shape Retrieval and Deformation Mikaela Angelina Uy, Vladimir G. Kim, Minhyu

Mikaela Uy 38 Oct 18, 2022
Pytorch Implementation of the paper "Cross-domain Correspondence Learning for Exemplar-based Image Translation"

CoCosNet Pytorch Implementation of the paper "Cross-domain Correspondence Learning for Exemplar-based Image Translation" (CVPR 2020 oral). Update: 202

Lingbo Yang 38 Sep 22, 2021
ReGAN: Sequence GAN using RE[INFORCE|LAX|BAR] based PG estimators

Sequence Generation with GANs trained by Gradient Estimation Requirements: PyTorch v0.3 Python 3.6 CUDA 9.1 (For GPU) Origin The idea is from paper Se

40 Nov 03, 2022
a dnn ai project to classify which food people are eating on audio recordings

Deep Learning - EAT Challenge About This project is part of an AI challenge of the DeepLearning course 2021 at the University of Augsburg. The objecti

Marco Tröster 1 Oct 24, 2021
Elevation Mapping on GPU.

Elevation Mapping cupy Overview This is a ros package of elevation mapping on GPU. Code are written in python and uses cupy for GPU calculation. * pla

Robotic Systems Lab - Legged Robotics at ETH Zürich 183 Dec 19, 2022
2D Human Pose estimation using transformers. Implementation in Pytorch

PE-former: Pose Estimation Transformer Vision transformer architectures perform very well for image classification tasks. Efforts to solve more challe

Panteleris Paschalis 23 Oct 17, 2022
This repository contains python code necessary to replicated the experiments performed in our paper "Invariant Ancestry Search"

InvariantAncestrySearch This repository contains python code necessary to replicated the experiments performed in our paper "Invariant Ancestry Search

Phillip Bredahl Mogensen 0 Feb 02, 2022
Creating Artificial Life with Reinforcement Learning

Although Evolutionary Algorithms have shown to result in interesting behavior, they focus on learning across generations whereas behavior could also be learned during ones lifetime.

Maarten Grootendorst 49 Dec 21, 2022
A Python module for parallel optimization of expensive black-box functions

blackbox: A Python module for parallel optimization of expensive black-box functions What is this? A minimalistic and easy-to-use Python module that e

Paul Knysh 426 Dec 08, 2022
This is the official implementation for "Do Transformers Really Perform Bad for Graph Representation?".

Graphormer By Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng*, Guolin Ke, Di He*, Yanming Shen and Tie-Yan Liu. This repo is the official impl

Microsoft 1.3k Dec 29, 2022
This repository contains a PyTorch implementation of the paper Learning to Assimilate in Chaotic Dynamical Systems.

Amortized Assimilation This repository contains a PyTorch implementation of the paper Learning to Assimilate in Chaotic Dynamical Systems. Abstract: T

4 Aug 16, 2022
Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)

Pano-AVQA Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021) [Paper] [Poster] [Video] Getting Starte

Heeseung Yun 9 Dec 23, 2022
Subpopulation detection in high-dimensional single-cell data

PhenoGraph for Python3 PhenoGraph is a clustering method designed for high-dimensional single-cell data. It works by creating a graph ("network") repr

Dana Pe'er Lab 42 Sep 05, 2022
PG2Net: Personalized and Group PreferenceGuided Network for Next Place Prediction

PG2Net PG2Net:Personalized and Group Preference Guided Network for Next Place Prediction Datasets Experiment results on two Foursquare check-in datase

Urban Mobility 5 Dec 20, 2022
[TPAMI 2021] iOD: Incremental Object Detection via Meta-Learning

Incremental Object Detection via Meta-Learning To appear in an upcoming issue of the IEEE Transactions on Pattern Analysis and Machine Intelligence (T

Joseph K J 66 Jan 04, 2023