Hierarchical Skills for Efficient Exploration

This is the source code release for the paper Hierarchical Skills for Efficient Exploration. It contains

  • Code for pre-training and hierarchical learning with HSD-3
  • Code for the baselines we compare to in the paper

Additionally, we provide pre-trained skill policies for the Walker and Humanoid robots considered in the paper.

The benchmark suite can be found in a standalone repository at facebookresearch/bipedal-skills.

Prerequisites

Install PyTorch according to the official instructions, for example in a new conda environment. This codebase was tested with PyTorch 1.8 and 1.9.

Then, install the remaining requirements via:

pip install -r requirements.txt

For optimal performance, we also recommend installing NVIDIA's PyTorch extensions.

Usage

We use Hydra to handle training configurations, with some defaults that might not suit everyone. In particular, we disable Hydra's default job directory management; this is convenient for local development but undesirable when running full experiments. To change this, adapt the initial portion of config/common.yaml or pass an override such as hydra.run.dir=./outputs/my-custom-string to the commands below.
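
As a minimal sketch of the command-line option, the helper below builds a hydra.run.dir override with a per-run timestamp, so that repeated experiments do not share an output directory. The function name run_dir_override and the ./outputs/<tag>-<timestamp> layout are illustrative assumptions and not part of this repository.

```python
from datetime import datetime
from typing import Optional


def run_dir_override(tag: str, now: Optional[datetime] = None) -> str:
    """Return a Hydra command-line override that sets a unique run directory.

    Hypothetical helper: the directory layout is an assumption for illustration.
    """
    now = now or datetime.now()
    # The timestamp contains no spaces or colons, so it needs no shell quoting.
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return f"hydra.run.dir=./outputs/{tag}-{stamp}"


# Example: append the override to one of the training commands.
print("python train.py -cn walker_hsd3", run_dir_override("walker-hsd3"))
```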

Pre-training Hierarchical Skills

For pre-training skill policies, use the pretrain.py script (note that this requires a machine with 2 GPUs):

# Walker robot
python pretrain.py -cn walker_pretrain
# Humanoid robot
python pretrain.py -cn humanoid_pretrain

Hierarchical Control

High-level policy training with HSD-3 is done as follows:

# Walker robot
python train.py -cn walker_hsd3
# Humanoid robot
python train.py -cn humanoid_hsd3

The default configuration assumes that a pre-trained skill policy is available at checkpoint-lo.pt. The location can be overridden by setting a new value for agent.lo.init_from (see below for an example). By default, a high-level agent will be trained on the "Hurdles" task. This can be changed by passing env.name=BiskStairs-v1, for example.

Pre-trained skill policies are available here. After unpacking the archive in the top-level directory of this repository, they can be used as follows:

# Walker robot
python train.py -cn walker_hsd3 agent.lo.init_from=$PWD/pretrained-skills/walker.pt
# Humanoid robot
python train.py -cn humanoid_hsd3 agent.lo.init_from=$PWD/pretrained-skills/humanoidpc.pt
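
When launching runs from a script, it can help to fail early if the skill checkpoint is missing. The sketch below assembles the same kind of command line from Python. build_train_command is a hypothetical helper; only the -cn, agent.lo.init_from and env.name arguments are taken from this README.

```python
from pathlib import Path
from typing import List, Optional


def build_train_command(
    config_name: str,
    checkpoint: Path,
    env_name: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a high-level training run.

    Hypothetical helper for illustration; it does not start training itself.
    """
    # Fail before launching if the low-level skill policy is not on disk.
    if not checkpoint.is_file():
        raise FileNotFoundError(f"skill checkpoint not found: {checkpoint}")

    cmd = [
        "python",
        "train.py",
        "-cn",
        config_name,
        # An absolute path, mirroring the $PWD/... form used above.
        f"agent.lo.init_from={checkpoint.resolve()}",
    ]
    # Optionally switch away from the default task.
    if env_name is not None:
        cmd.append(f"env.name={env_name}")
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the top-level directory of the repository, for example with build_train_command("walker_hsd3", Path("pretrained-skills/walker.pt"), env_name="BiskStairs-v1").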

Baselines

Individual baselines can be run by passing the following as the -cn argument to train.py (for the Walker robot):

Baseline                Configuration name
Soft Actor-Critic       walker_sac
DIAYN-C pre-training    walker_diaync_pretrain
DIAYN-C HRL             walker_diaync_hrl
HIRO-SAC                walker_hiro
Switching Ensemble      walker_se
HSD-Bandit              walker_hsdb
SD                      walker_sd

By default, walker_sd will select the full goal space. Other goal spaces can be selected by modifying the configuration, e.g., passing subsets=2-3+4 will limit high-level control to X translation (2) and the left foot (3+4).
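
To make the notation concrete, here is a small sketch of how such a subsets string can be read, assuming that "-" separates goal spaces and "+" joins the feature indices within one goal space, as the example above suggests. parse_subsets is a hypothetical helper for illustration and is not the parser used by this repository.

```python
from typing import List


def parse_subsets(spec: str) -> List[List[int]]:
    """Split a spec such as "2-3+4" into groups of feature indices.

    Assumption: "-" separates goal spaces, "+" joins features in one space.
    """
    groups = []
    for part in spec.split("-"):
        groups.append([int(index) for index in part.split("+")])
    return groups


# X translation (2) as one goal space, the left foot (3 and 4) as another.
print(parse_subsets("2-3+4"))  # [[2], [3, 4]]
```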

License

hsd3 is MIT licensed, as found in the LICENSE file.
