Learning Domain Invariant Representations in Goal-conditioned Block MDPs

Related tags

Deep LearningPASF
Overview

Learning Domain Invariant Representations in Goal-conditioned Block MDPs

Beining Han,   Chongyi Zheng,   Harris Chan,   Keiran Paster,   Michael R. Zhang,   Jimmy Ba

paper

Summary: Deep Reinforcement Learning agents often face unanticipated environmental changes after deployment in the real world. These changes are often spurious and unrelated to the underlying problem, such as background shifts for visual input agents. Unfortunately, deep RL policies are usually sensitive to these changes and fail to act robustly against them. This resembles the problem of domain generalization in supervised learning. In this work, we study this problem for goal-conditioned RL agents. We propose a theoretical framework in the Block MDP setting that characterizes the generalizability of goal-conditioned policies to new environments. Under this framework, we develop a practical method PA-SkewFit (PASF) that enhances domain generalization.

@article{han2021learning,
  title={Learning Domain Invariant Representations in Goal-conditioned Block MDPs},
  author={Han, Beining and Zheng, Chongyi and Chan, Harris and Paster, Keiran and Zhang, Michael and Ba, Jimmy},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

Installation

Our code was adapted from rlkit and was tested on a Ubuntu 20.04 server.

This instruction assumes that you have already installed NVIDIA driver, Anaconda, and MuJoCo.

You'll need to get your own MuJoCo key if you want to use MuJoCo.

1. Create Anaconda environment

Install the included Anaconda environment

$ conda env create -f environment/pasf_env.yml
$ source activate pasf_env
(pasf_env) $ python

2. Download the goals

Download the goals from the following link and put it here: (PASF DIR)/multiworld/envs/mujoco.

$ ls (PASF DIR)/multiworld/envs/mujoco
... goals ... 
  1. (Optional) Speed up with GPU rendering

3. (Optional) Speed-up with GPU rendering

Note: GPU rendering for mujoco-py speeds up training a lot but consumes more GPU memory at the same time.

Check this Issues:

Remember to do this stuff with the mujoco-py package inside of your pasf_env.

Running Experiments

The following command run the PASF experiments for the four tasks: Reach, Door, Push, Pickup, in the learning curve respectively.

$ source activate pasf_env
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_reach_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_door_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_push_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_pickup_lc_exp.bash
  • The bash scripts only set equation, equation, and equation with the exact values we used for LC. But you can play with other hyperparameters in python scripts under (PASF DIR)/experiment.

  • Training and evaluation environments are chosen in python scripts for each task. You can find the backgrounds in (PASF DIR)/multiworld/core/background and domains in (PASF DIR)/multiworld/envs/assets/sawyer_xyz.

  • Results are recorded in progress.csv under (PASF DIR)/data/ and variant.json contains configuration for each experiment.

  • We simply set random seeds as 0, 1, 2, etc., and run experiments with 6-9 different seeds for each task.

  • Error and output logs can be found in (PASF DIR)/terminal_log.

Questions

If you have any questions, comments, or suggestions, please reach out to Beining Han ([email protected]) and Chongyi Zheng ([email protected]).

Owner
Chongyi Zheng
Chongyi Zheng
retweet 4 satoshi ⚡️

rt4sat retweet 4 satoshi This bot is the codebase for https://twitter.com/rt4sat please feel free to create an issue if you saw any bugs basically thi

6 Sep 30, 2022
A Unified Framework and Analysis for Structured Knowledge Grounding

UnifiedSKG 📚 : Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models Code for paper UnifiedSKG: Unifying and Mu

HKU NLP Group 370 Dec 21, 2022
Probabilistic Entity Representation Model for Reasoning over Knowledge Graphs

Implementation for the paper: Probabilistic Entity Representation Model for Reasoning over Knowledge Graphs, Nurendra Choudhary, Nikhil Rao, Sumeet Ka

Nurendra Choudhary 8 Nov 15, 2022
Two-stage CenterNet

Probabilistic two-stage detection Two-stage object detectors that use class-agnostic one-stage detectors as the proposal network. Probabilistic two-st

Xingyi Zhou 1.1k Jan 03, 2023
This is the repository of the NeurIPS 2021 paper "Curriculum Disentangled Recommendation withNoisy Multi-feedback"

Curriculum_disentangled_recommendation This is the repository of the NeurIPS 2021 paper "Curriculum Disentangled Recommendation with Noisy Multi-feedb

14 Dec 20, 2022
Try out deep learning models online on Google Colab

Try out deep learning models online on Google Colab

Erdene-Ochir Tuguldur 1.5k Dec 27, 2022
🚀 An end-to-end ML applications using PyTorch, W&B, FastAPI, Docker, Streamlit and Heroku

🚀 An end-to-end ML applications using PyTorch, W&B, FastAPI, Docker, Streamlit and Heroku

Made With ML 82 Jun 26, 2022
Code for our CVPR2021 paper coordinate attention

Coordinate Attention for Efficient Mobile Network Design (preprint) This repository is a PyTorch implementation of our coordinate attention (will appe

Qibin (Andrew) Hou 726 Jan 05, 2023
ETMO: Evolutionary Transfer Multiobjective Optimization

ETMO: Evolutionary Transfer Multiobjective Optimization To promote the research on ETMO, benchmark problems are of great importance to ETMO algorithm

Songbai Liu 0 Mar 16, 2021
Implementation of the final project of the course DDA6309 Probabilistic Graphical Model

Task-aware Joint CWS and POS (TCwsPos) This is the implementation of the final project of the course DDA6309 Probabilistic Graphical Models, The Chine

Peng 1 Dec 26, 2021
Classifying audio using Wavelet transform and deep learning

Audio Classification using Wavelet Transform and Deep Learning A step-by-step tutorial to classify audio signals using continuous wavelet transform (C

Aditya Dutt 17 Nov 29, 2022
Nsdf: A mesh SDF with just some code we can directly paste into our raymarcher

nsdf Representing SDFs of arbitrary meshes has been a bit tricky so far. Express

Jan Ivanecky 5 Feb 18, 2022
RL-driven agent playing tic-tac-toe on starknet against challengers.

tictactoe-on-starknet RL-driven agent playing tic-tac-toe on starknet against challengers. GUI reference: https://pythonguides.com/create-a-game-using

21 Jul 30, 2022
Evaluating deep transfer learning for whole-brain cognitive decoding

Evaluating deep transfer learning for whole-brain cognitive decoding This README file contains the following sections: Project description Repository

Armin Thomas 5 Oct 31, 2022
Randomized Correspondence Algorithm for Structural Image Editing

===================================== README: Inpainting based PatchMatch ===================================== @Author: Younesse ANDAM @Conta

Younesse 116 Dec 24, 2022
A transformer which can randomly augment VOC format dataset (both image and bbox) online.

VocAug It is difficult to find a script which can augment VOC-format dataset, especially the bbox. Or find a script needs complex requirements so it i

Coder.AN 1 Mar 05, 2022
Image Segmentation Evaluation

Image Segmentation Evaluation Martin Keršner, [email protected] Evaluation

Martin Kersner 273 Oct 28, 2022
Integrated Semantic and Phonetic Post-correction for Chinese Speech Recognition

Integrated Semantic and Phonetic Post-correction for Chinese Speech Recognition | paper | dataset | pretrained detection model | Authors: Yi-Chang Che

Yi-Chang Chen 1 Aug 23, 2022
Attentive Implicit Representation Networks (AIR-Nets)

Attentive Implicit Representation Networks (AIR-Nets) Preprint | Supplementary | Accepted at the International Conference on 3D Vision (3DV) teaser.mo

29 Dec 07, 2022
Computer vision - fun segmentation experience using classic and deep tools :)

Computer_Vision_Segmentation_Fun Segmentation of Images and Video. Tools: pytorch Models: Classic model - GrabCut Deep model - Deeplabv3_resnet101 Flo

Mor Ventura 1 Dec 18, 2021